Rigorous characterization of potential energy surfaces, with the goal of determining rate coefficients involves running a large number of quantum chemistry calculations.
The manual preparation of input files for such calculations and the analysis of the results is a tedious and error-prone task, and relies too much on chemical intuition.
References
• R. Van de Vijver, J. Zádor, Comp. Phys. Comm., 2019, 248, 106947.
• J. Zádor, C. Martí, R. Van de Vijver, S. L. Johansen, Y. Yang, H. A. Michelsen, H. N. Najm, J. Phys. Chem. A, 2023, 127, 565–588.
Once set, the level assignments should not be changed.
The reactions that a given species may undergo are based on the identification of patterns in it. These are grouped in reaction families similar to the RMG ones.
Then, via a series of constrained optimization steps, a TS candidate is generated followed by an actual saddle-point optimization.
We will generate a random molecule or radical.
Draw and convert to smiles:
https://www.rcsb.org/chemical-sketch or
https://pubchem.ncbi.nlm.nih.gov//edit3/index.html
Input file demo.json
- to be run on one of our clusters.
{
"title": "demo",
"smiles": NEEDED,
"charge": 0,
"mult": NEEDED,
"barrier_threshold": NEEDED,
"skip_families": ["hom_sci"],
"qc": "gauss",
"calcall_ts": 1,
"queuing":"slurm",
"queue_name":"medium",
"queue_template": "qu.tmp",
"queue_job_limit": 100,
"ppn": 8,
"scratch": "/scratch/jzador",
"username": "jzador"
}
To run:
kinbot demo.json & disown
Usually it is advisable to create a dedicated conda environment.
KinBot works only with Python version ≥ 3.8.
This installation creates the conda enviroment and installs KinBot, PESViewer and all their dependencies.
Download kb_env.yml
yaml file from here.
conda env create -f kb_env.yml
You need to run this command from the directory where you downloaded kb_env.yml
or provide the path to the file.
Installation might take several minutes to complete.
Alternatively, you can install just KinBot and its dependencies with one of the following:
conda install -c conda-forge kinbot
pip install kinbot
git clone git@github.com:zadorlab/KinBot.git
cd KinBot/
pip install -e .
OpenBabel and RDKit can be a bit trickier to install with the pip installer.
Refer to their documentation for help.
Python dependencies are pulled in by any of the two installers (pip or conda). The most important dependencies:
The input parameters for KinBot are provided in a *.json
file set up by the user. The detailed description of all of the keywords that can be specified in the input is in the wiki. Almost all keywords have sensible defaults.
Here is a basic example, propyl.json
, that aims to look at the isomerization and decomposition of the propyl radical.
{
"title": "propyl",
"structure": [
"C", 1.27, 0.23, -0.45,
"C", 0.19, -0.07, 0.53,
"C", -1.14, 0.11, -0.18,
"H", 2.22, 0.33, 0.08,
"H", 1.26, -0.54, -1.23,
"H", 0.32, -1.11, 0.88,
"H", 0.17, 0.58, 1.40,
"H", -1.14, -0.36, -1.18,
"H", -1.26, 1.19, -0.30,
"H", -1.89, -0.37, 0.45
],
"charge": 0,
"mult": 2,
"qc": "gauss",
"method": "b3lyp",
"basis": "6-31+g*",
"queuing": "local"
}
Except for quick studies or specific cases, it is advisable to provide the initial geometry in Cartesian coordinates via the "structure"
keyword (as opposed to SMILES). The initial geometry does not need to be an optimized one, just something that comes out from a drawing program, e.g., Avogadro.
For the purposes of this tutorial we set "queuing": "local"
, which performs a dry run without submitting any calculations. The quantum chemistry code output files and kinbot.db
need to be present when running in this mode.
For real calculations "queuing"
needs to be set to "pbs"
, "slurm"
or "fireworks"
and appended with additional keywords controling the computational environment.
Among these keywords, probably the most important is "queue_template"
which provides a template for constructing job submission files.
#! /bin/bash
#SBATCH -N 1
#SBATCH -c {ppn}
#SBATCH -q {queue_name}
#SBATCH -o {errdir}/{name}.stdout
#SBATCH -e {errdir}/{name}.err
#SBATCH --mem=10gb
{slurm_feature}
export OMP_NUM_THREADS={ppn}
Additionally, the keywords "q_temp_am1"
, "q_temp_hi"
, "q_temp_mp2"
, "q_temp_l3"
, allow for further flexibility by specifying different templates to different kind of jobs.
Before running, remember to be in the correct environment if you have decided to create one. In the case of a conda environment:
conda activate kinbot
Then, you can run KinBot by simply typing:
kinbot propyl.json & disown
Because KinBot execution can sometimes take days/weeks to finish, the appended & disown
detaches the KinBot process from the terminal and we are able to close the terminal window without killing the KinBot process.
KinBot typically runs on the login or head node, and requires relatively very little resources compared to all of the quantum chemical calculations running on the worker nodes. However, if needed, KinBot can also be submitted as a job.
Running KinBot on a cluster most often requires the customization of our queue submission templates (e.g., slurm or pbs). The customized template can be provided via the
"queue_template"
keyword in the *.json
input file. Further customization is also possible for the various types of calculations to better match the profile and policies of the HPC resource.
Let us know if your cluster needs further settings, we can most likely expand the code to accommodate such needs.
kinbot.log
: Contains a detailed, time-stamped logging of the events that happened during the runs. This file is supposed to be used first when questions arise about the calculations, such as, why a certain pathway is not present. It also contains the values of all input parameters for the given run.
26-Jul-23 16:00:25-INFO:
Copyright 2018 National Technology & Engineering Solutions of Sandia,
LLC (NTESS). Under the terms of Contract DE-NA0003525 with NTESS,
the U.S. Government retains certain rights in this software.
26-Jul-23 16:00:25-INFO: Input parameters
26-Jul-23 16:00:25-INFO: input_file: exercise1.json
title: propyl
verbose: 0
smiles:
structure: ['C', 1.27, 0.23, -0.45, 'C', 0.19, -0.07, 0.53, 'C', -1.14, 0.11, -0.18, 'H', 2.22, 0.33, 0.08, 'H', 1.26, -0.54, -1.23, 'H', 0.32, -1.11, 0.88, 'H', 0.17, 0.58, 1.4, 'H', -1.14, -0.36, -1.18, 'H', -1.26, 1.19, -0.3, 'H', -1.89, -0.37, 0.45]
charge: 0
mult: 2
bimol: 0
reaction_search: 1
families: ['all']
bimol_families: ['all']
skip_families: ['none']
skip_chemids: ['none']
keep_chemids: ['none']
skip_reactions: ['none']
variational: 0
barrierless_saddle: {}
barrierless_saddle_start: 2.0
barrierless_saddle_step: 0.2
homolytic_bonds: {}
specific_reaction: 0
break_bonds: []
form_bonds: []
barrier_threshold: 100.0
hom_sci_threshold_add: 5.0
scan_step: 30
pes: 0
simultaneous_kinbot: 5
high_level: 0
calc_aie: 0
conformer_search: 0
conf_grid: 3
semi_emp_conformer_search: 0
rotor_scan: 0
free_rotor_thrs: 0.1
nrotation: 12
plot_hir_profiles: 0
rotation_restart: 3
max_dihed: 5
random_conf: 500
flat_ring_dih_angle: 5.0
max_dihed_semi_emp: 5
random_conf_semi_emp: 500
semi_emp_confomer_threshold: 5
multi_conf_tst: 0
multi_conf_tst_temp: None
multi_conf_tst_boltz: 0.05
print_conf: 0
min_bond_break: 2
max_bond_break: 3
comb_molec: 1
comb_rad: 1
comb_lone: 1
comb_pi: 1
break_valence: 1
one_reaction_comb: 0
one_reaction_fam: 0
ringrange: [3, 9]
qc: gauss
methodclass: dft
qc_command: g16
method: b3lyp
basis: 6-31+g*
scan_method: mp2
scan_basis: 6-31G
barrierless_saddle_method: b3lyp
barrierless_saddle_basis: 6-31G
barrierless_saddle_method_high: b3lyp
barrierless_saddle_basis_high: 6-31G
calcall_ts: 0
high_level_method: M062X
high_level_basis: 6-311++G(d,p)
semi_emp_method: am1
integral:
opt:
irc_maxpoints: 30
irc_stepsize: 20
guessmix: 0
L3_calc: 0
single_point_qc: molpro
single_point_template:
single_point_key:
barrierless_saddle_single_point_template:
barrierless_saddle_prod_single_point_template:
barrierless_saddle_norbital: 0
barrierless_saddle_nelectron: 0
barrierless_saddle_nstate: 0
barrierless_saddle_single_point_key:
single_point_command:
hir_maxcycle: None
rigid_hir: 0
use_sella: False
imagfreq_threshold: 50.0
queuing: local
queue_template:
q_temp_am1:
q_temp_hi:
q_temp_mp2:
q_temp_l3:
queue_name: medium
slurm_feature:
ppn: 1
single_point_ppn: 4
zf: 4
delete_intermediate_files: 0
scratch:
username:
queue_job_limit: -1
me: 0
run_me: 0
me_code: mess
collider: He
epsilon: 0.0
epsilon_unit: K
sigma: 0.0
mess_command: mess
TemperatureList: [300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0, 1100.0, 1200.0, 1300.0, 1400.0, 1500.0, 1600.0, 1700.0, 1800.0, 1900.0, 2000.0]
PressureList: [7.6, 76.0, 760.0, 7600.0, 76000.0]
EnergyStepOverTemperature: 0.2
ExcessEnergyOverTemperature: 30
ModelEnergyLimit: 400
CalculationMethod: direct
ChemicalEigenvalueMax: 0.2
EnergyRelaxationFactor: 200
EnergyRelaxationPower: 0.85
EnergyRelaxationExponentCutoff: 15
mesmer_command: mesmer
uq: 0
uq_n: 1
uq_max_runs: 5
well_uq: 0.5
barrier_uq: 1.0
freq_uq: 1.2
imagfreq_uq: 1.1
epsilon_uq: 1.2
sigma_uq: 1.2
enrelfact_uq: 1.2
enrelpow_uq: 1.2
pstsymm_uq: 2.0
26-Jul-23 16:00:25-INFO: Starting KinBot
26-Jul-23 16:00:28-INFO: Starting optimization of initial well...
26-Jul-23 16:00:28-INFO: Starting reaction search...
26-Jul-23 16:00:28-INFO: Found the following reactions:
26-Jul-23 16:00:28-INFO: 431391510850120000002_intra_H_migration_1_6
26-Jul-23 16:00:28-INFO: 431391510850120000002_intra_H_migration_1_8
26-Jul-23 16:00:28-INFO: 431391510850120000002_intra_R_migration_1_3
26-Jul-23 16:00:28-INFO: 431391510850120000002_Intra_R_Add_ExoTetCyclic_F_1_3_8
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_1_2_3
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_1_2_6
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_2_1_4
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_2_3_8
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_3_2_1
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_3_2_6
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_4_1_2
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_6_2_1
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_6_2_3
26-Jul-23 16:00:28-INFO: 431391510850120000002_r12_insertion_R_8_3_2
26-Jul-23 16:00:28-INFO: 431391510850120000002_R_Addition_MultipleBond_1_2_3
26-Jul-23 16:00:28-INFO: 431391510850120000002_R_Addition_MultipleBond_1_2_6
26-Jul-23 16:00:28-INFO: 431391510850120000002_h2_elim_4_6
26-Jul-23 16:00:28-INFO: 431391510850120000002_h2_elim_6_8
26-Jul-23 16:00:28-INFO: 431391510850120000002_hom_sci_1_2
26-Jul-23 16:00:28-INFO: 431391510850120000002_hom_sci_1_4
26-Jul-23 16:00:28-INFO: 431391510850120000002_hom_sci_2_3
26-Jul-23 16:00:28-INFO: 431391510850120000002_hom_sci_2_6
26-Jul-23 16:00:28-INFO: 431391510850120000002_hom_sci_3_8
26-Jul-23 16:00:30-INFO: Rxn search failed for 431391510850120000002_R_Addition_MultipleBond_1_2_6
26-Jul-23 16:00:30-INFO: Reaction 431391510850120000002_hom_sci_1_2 leads to products 140260020000000000003 290890730120000000002
26-Jul-23 16:00:30-INFO: Reaction 431391510850120000002_hom_sci_1_4 leads to products 421261360680060000001 10000000000000000002
26-Jul-23 16:00:30-INFO: Reaction 431391510850120000002_hom_sci_2_3 leads to products 280760560080000000001 150390060000000000002
26-Jul-23 16:00:30-INFO: Reaction 431391510850120000002_hom_sci_2_6 leads to products 421261230750120000001 10000000000000000002
26-Jul-23 16:00:30-INFO: Reaction 431391510850120000002_hom_sci_3_8 leads to products 421261340680080000001 10000000000000000002
26-Jul-23 16:00:31-INFO: a) Product optimized to other structure for 431391510850120000002_hom_sci_3_8, product 421261340680080000001 to 421261230750120000001
26-Jul-23 16:00:32-INFO: Rxn search failed for 431391510850120000002_h2_elim_6_8, no imaginary freq.
26-Jul-23 16:00:34-INFO: IRCs successful for 431391510850120000002_intra_H_migration_1_6
26-Jul-23 16:00:34-INFO: Both IRCs lead to the initial well, identical reaction found: 431391510850120000002_intra_H_migration_1_8
26-Jul-23 16:00:34-INFO: No product found for 431391510850120000002_intra_H_migration_1_8
26-Jul-23 16:00:34-INFO: Both IRCs lead to the initial well, identical reaction found: 431391510850120000002_intra_R_migration_1_3
26-Jul-23 16:00:34-INFO: No product found for 431391510850120000002_intra_R_migration_1_3
26-Jul-23 16:00:34-INFO: IRCs successful for 431391510850120000002_Intra_R_Add_ExoTetCyclic_F_1_3_8
26-Jul-23 16:00:34-INFO: Both IRCs lead to the initial well, identical reaction found: 431391510850120000002_r12_insertion_R_1_2_3
26-Jul-23 16:00:34-INFO: No product found for 431391510850120000002_r12_insertion_R_1_2_3
26-Jul-23 16:00:34-INFO: Rxn barrier too high (107.19 kcal/mol) for 431391510850120000002_r12_insertion_R_1_2_6
26-Jul-23 16:00:34-INFO: IRCs successful for 431391510850120000002_r12_insertion_R_2_1_4
26-Jul-23 16:00:35-INFO: IRCs successful for 431391510850120000002_r12_insertion_R_2_3_8
26-Jul-23 16:00:35-INFO: Both IRCs lead to the initial well, identical reaction found: 431391510850120000002_r12_insertion_R_3_2_1
26-Jul-23 16:00:35-INFO: No product found for 431391510850120000002_r12_insertion_R_3_2_1
26-Jul-23 16:00:35-INFO: IRCs successful for 431391510850120000002_r12_insertion_R_3_2_6
26-Jul-23 16:00:35-INFO: Rxn barrier too high (105.42 kcal/mol) for 431391510850120000002_r12_insertion_R_4_1_2
26-Jul-23 16:00:35-INFO: IRCs successful for 431391510850120000002_r12_insertion_R_6_2_1
26-Jul-23 16:00:35-INFO: IRCs successful for 431391510850120000002_r12_insertion_R_6_2_3
26-Jul-23 16:00:36-INFO: Neither IRC leads to the initial well for 431391510850120000002_r12_insertion_R_8_3_2
26-Jul-23 16:00:36-INFO: No product found for 431391510850120000002_r12_insertion_R_8_3_2
26-Jul-23 16:00:36-INFO: Rxn barrier too high (108.87 kcal/mol) for 431391510850120000002_R_Addition_MultipleBond_1_2_3
26-Jul-23 16:00:36-INFO: Neither IRC leads to the initial well for 431391510850120000002_h2_elim_4_6
26-Jul-23 16:00:36-INFO: No product found for 431391510850120000002_h2_elim_4_6
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_intra_H_migration_1_6 leads to products 431391400900180000002
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_Intra_R_Add_ExoTetCyclic_F_1_3_8 leads to products 421502341800240000001 10000000000000000002
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_r12_insertion_R_2_1_4 leads to products 130130000000000000002 301020900180000000001
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_r12_insertion_R_2_3_8 leads to products 280760560080000000001 150390060000000000002
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_r12_insertion_R_3_2_6 leads to products 280760560080000000001 150390060000000000002
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_r12_insertion_R_6_2_1 leads to products 421261230750120000001 10000000000000000002
26-Jul-23 16:00:37-INFO: Reaction 431391510850120000002_r12_insertion_R_6_2_3 leads to products 421261230750120000001 10000000000000000002
26-Jul-23 16:00:44-INFO: Reaction generation done!
26-Jul-23 16:00:44-INFO: KinBot finished.
Upon rerunning, KinBot automatically backs the previous kinbot.log
file up renaming it to append a time-stamp of the initial submission.
431391510850120000002
).431391400900180000002_r12_insertion_R_1_2_3
.summary_[chemid].out
: A reaction-by-reaction summary of the outcome of each search written at the end of the kinbot execution. Deleted on restart and written on successful overall run completion.
SUCCESS 41.04 431391400900180000002_intra_H_migration_2_4 431391510850120000002
FAILED 431391400900180000002_r12_insertion_R_1_2_3
FAILED 431391400900180000002_r12_insertion_R_1_2_7
FAILED 431391400900180000002_r12_insertion_R_2_1_4
FAILED 431391400900180000002_r12_insertion_R_4_1_2
FAILED 431391400900180000002_r12_insertion_R_7_2_3
SUCCESS 41.15 431391400900180000002_R_Addition_MultipleBond_2_1_4 10000000000000000002 421261230750120000001
FAILED 431391400900180000002_h2_elim_4_7
kinbot_monitor.out
: contains a real-time progress variable of each reaction search.
FAILED
means a range of things, it can be a saddle w/o the correct IRC, an unconverged saddle optimization, or a barrier that is simply too high.
Note that KinBot in this mode keeps duplicate channels if their energies differ by more than 1 kcal/mol.
If no conformational search is not requested, it can happen for channels that are chemically identical. However, some duplicates are real, e.g., H atom transfer in 2-butyl to 1-butyl can happen over a 1,2 or a 1,3 channel.
kinbot.db
: this is an SQLite database, written with ASE's database wrapper (ase.db
), that contains the results of essentially all quantum chemical calculations that were carried out ever in the current folder. When duplicates are found, KinBot always retrieves the last entry.
If you delete
kinbot.db
, it will trigger KinBot to recalculate everything. Do NOT delete this file!
pesviewer.inp
: input file to the PESViewer utility, see later.
xyz
folder: contains the Cartesian coordinates of the species at the highest level of theory attained.
KinBot also retains all of the raw log files from the quantum chemistry codes, which can be inspected if needed. Remember, KinBot is NOT a black box code.
If you want to recalculate specific jobs on restart, simply delete the quantum chemistry code's log file, and KinBot will fill in the missing jobs.
However, if the deleted jobs had dependencies, they will not be automatically recalculated. If this is desired, you need to manually delete the whole chain of events. E.g., saddle optimization will not trigger resubmitting the related IRC calculations if the IRC output files are in place already.
PESViewer relies on RDKit to generate nice 2D representations for the structures.
conda install -c conda-forge pesviewer
git clone git@github.com:zadorlab/pesviewer.git
cd PESViewer/
pip install -e .
Noteworthy third-party Python dependencies (already pulled in by the conda installer):
If you created the environment using the
kb_env.yml
file you don't need to run the conda install
command.
pesviewer pesviewer.inp &
Stationary points and structures can be moved around to arrange the picture nicely.
png
file in the *_2d
directory with the desired image. Barrierless channels are simply channels that (a) have an asymptote in the desired energy range and (b) no barrier was found (in the energy range) for their formation. One should use more sophisticated calculations to confirm and characterize a barrierless channel.
pesviewer.inp
This is how pesviewer.inp
looks like, when created automatically by KinBot:
> <options>
title 0 # print a title (1) or not (0)
units kcal/mol # energy units
use_xyz 1 # use xyz, put 0 to switch off
rescale 0 # no rescale , put the well or bimolecular name here to rescale to that value
fh 9. # figure height
fw 18. # figure width
margin 0.2 # margin fraction on the x and y axis
dpi 120 # dpi of the molecule figures
save 0 # does the plot need to be saved (1) or displayed (0)
write_ts_values 1 # booleans tell if the ts energy values should be written
write_well_values 1 # booleans tell if the well and bimolecular energy values should be written
bimol_color red # color of the energy values for the bimolecular products
well_color blue # color of the energy values of the wells
ts_color green # color or the energy values of the ts, put to 'none' to use same color as line
show_images 1 # boolean tells whether the molecule images should be shown on the graph
rdkit4depict 1 # boolean that specifies which code was used for the 2D depiction
axes_size 10 # font size of the axes
text_size 10 # font size of the energy values on the graph
graph_edge_color black # color of graph edge, if set to 'energy', will be scaled accordingly
> <wells>
431391510850120000002 0.0
431391400900180000002 -4.32
> <bimolec>
10000000000000000002_421502341800240000001 26.82
130130000000000000002_301020900180000000001 78.26
150390060000000000002_280760560080000000001 19.02
10000000000000000002_421261230750120000001 33.46
140260020000000000003_290890730120000000002 76.00
10000000000000000002_421261360680060000001 92.47
> <ts>
431391510850120000002_intra_H_migration_1_6 40.62 431391510850120000002 431391400900180000002
431391510850120000002_Intra_R_Add_ExoTetCyclic_F_1_3_8 60.06 431391510850120000002 10000000000000000002_421502341800240000001
431391510850120000002_r12_insertion_R_2_1_4 91.64 431391510850120000002 130130000000000000002_301020900180000000001
431391510850120000002_r12_insertion_R_2_3_8 27.42 431391510850120000002 150390060000000000002_280760560080000000001
431391510850120000002_r12_insertion_R_6_2_1 35.64 431391510850120000002 10000000000000000002_421261230750120000001
> <barrierless>
431391510850120000002_hom_sci_1_2 431391510850120000002 140260020000000000003_290890730120000000002
431391510850120000002_hom_sci_1_4 431391510850120000002 10000000000000000002_421261360680060000001
It's easy to edit this file to exclude unwanted pathways or even add new pathways.
In pyrrolidine_oxi.json
we demonstrate some of the other features of KinBot that allow fine-tuning the search, plus the ability to deal with heteroatoms.
We are now going to run KinBot in the pes
mode.
kinbot
mode: single-well mode, explores pathways from a well and then it stops.pes
mode: multi-well mode, launches new kinbot
calculations from each of the successfully characterized unimolecular products obtained during other kinbot
runs.For the pes
exploration, go to the exercise2
directory in the downloaded files and simply type:
pes exercise2.json & disown
In the following example KinBot is executed in PES mode ("pes": 1
), and it is set to run a conformer search ("conformer_search": 1
) and a hindered rotor scan ("rotor_scan": 1
) for each of the stationary points found.
Barriers higher than 36 kcal/mol at L1 are discarded ("barrier_threshold": 36.0
). Finally, only one homolytic scission, the one between atoms 2 and 13 is attempted (this is the C–OO bond, zero-indexed).
{
"title":"Pyrrolidine_oxidation",
"structure": [
"C", -0.80, -0.94, -0.08,
"N", -1.05, 0.49, -0.19,
"C", 0.15, 1.21, -0.18,
"C", 1.20, 0.33, 0.42,
"C", 0.73, -1.06, -0.02,
"H", -1.22, -1.49, -0.94,
"H", -1.27, -1.34, 0.81,
"H", 0.09, 2.29, -0.04,
"H", -1.78, 0.80, -0.81,
"H", 1.21, 0.40, 1.52,
"H", 2.21, 0.57, 0.07,
"H", 1.06, -1.86, 0.65,
"H", 1.13, -1.28, -1.01,
"O", 0.83, 1.42, -1.36,
"O", 1.88, 2.08, -1.13
],
"charge" : 0,
"mult" : 2,
"pes" : 1,
"conformer_search" : 1,
"rotor_scan" : 1,
"barrier_threshold" : 36.0,
"homolytic_bonds": {"1022904475485954882032": [[2, 13]]},
⋮
KinBot uses three levels of theory:
L1: Saddle point and conformational searches.
L2: Refinement of L1 geometries and rovibrational properties, including hindered rotor scans.
L3: Refinement of electronic energies on L2 geometries with a post-Hartree-Fock method.
Here L1 is set to be B3LYP/6-31+G* while L2 refinement is activated with "high_level": 1
. The related L2 method and basis set are specified with the keywords "high_level_method"
and "high_level_basis"
(ωB97X-D/6-311++G(d,p)).
⋮
"high_level" : 1,
"qc" : "gauss",
"method" : "b3lyp",
"basis" : "6-31+g*",
"high_level_method" : "wB97XD",
"high_level_basis" : "6-311++G(d,p)",
⋮
⋮
"L3_calc": 1,
"single_point_qc": "molpro",
"single_point_command": "molpro -n 4 -d /scratch/USER",
"single_point_key": "MYTZA",
"single_point_template": "molpro.tpl",
"single_point_ppn" : 4,
⋮
Note that the scratch here is pointing to a directory of USER - needs to be changed to the appropriate location on your machine.
MYTZA
is a key that KinBot is going to look for in the molpro output file, and should be present in the molpro.tpl
template file.
Due to the high cost of L3 calculations, KinBot does NOT automatically submit the jobs, but creates the submission scripts necessary to submit them and a bash script inside the
molpro
/orca
directory called batch_L3_XXX.sub
to submit them all.
The details of the method used are set in a separate template file ("single_point_template": "molpro.tpl"
) which KinBot fills with the data of the system automatically and looks like:
***,{name}
memory,3200,M
geomtyp=xyz
geometry={{
{natom}
{name}
{geom}
}}
{{uhf;wf,{nelectron},{symm},{spin},{charge}}}
basis=cc-pvtz-f12
rhf
CCSD(T)-F12
mytza = energy(1)
mytzb = energy(2)
----
While it is possible to add anything into this file as needed, the subsitutions "{}
" in this example are compulsory.
If a certain field is not needed, it can be moved after the
---
line in the template, but it needs to be present anyway in the file.
⋮
"semi_emp_conformer_search": 0,
"conf_grid": 3,
"max_dihed": 5,
"random_conf": 500,
"flat_ring_dih_angle": 5,
"max_dihed_semi_emp": 5,
"random_conf_semi_emp": 500,
"semi_emp_confomer_threshold": 5,
"multi_conf_tst": 0,
"multi_conf_tst_temp": 300.0,
"multi_conf_tst_boltz": 0.05,
"print_conf": 0,
⋮
Once conformer_search
is activated, it will do conformer search on a grid defined by conf_grid
, unless there are more than max_dihed
conformer generating rotors in a structure, in which case the code switches to random conformers. For more details, refer to the KinBot literature.
Once the
conf_grid
is defined, it cannot be changed on subsequent runs. All the conformer runs need to be redone (i.e., by deleting the entire conf
directory).
It is best to explore the PES first just at L1 without conformational search to understand the expected and needed computational costs.
"free_rotor_thrs": 0.1,
"nrotation": 12,
"plot_hir_profiles": 0,
"rotation_restart": 3,
1D hindered rotor scans are also done on a grid defined by the user.
Hindered rotor scans are compulsory when a master equation calculation is requested (unless doing multi-conformer TST) because of how KinBot determines symmetry numbers.
"me" : 1,
"epsilon" : 446.577,
"sigma" : 6.0,
"PressureList": [760.0],
"TemperatureList": [300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0]
"uq": 1,
"uq_n": 100
}
This final section of the input controls the parameters in the master equation and UQ. In this case, 100 different master equations are created with perturbed molecular properties. For details, see https://pubs.acs.org/doi/abs/10.1021/acs.jpca.2c06558.
Remember that the L3 template file (here
molpro.tpl
) needs to be present alongside the *.json
input file and, when running in production mode with "queuing"
set to something different than local
, the batch submission template (e.g., qu.tpl
) also needs to be present.
The master equation is automatically assembled, but has to be submitted manually (unless "run_me"
is set to 1
) to allow for adjustments, e.g., for barrierless reactions. In case of UQ, the first master equation input file (mess_0000.inp
) is the base case.
L3 energies are also to be submitted manually. The molpro
directory contains a batch submission file, which contains the submission of the missing species only.
Both the
pesviewer.inp
and the master equations always contain consistent energies, therefore, even if a single L3 value is missing, the code reverts to L2. The missing L3 energies are noted in the pes.log
file.
pes
runpesviewer pesviewer.inp &
It is possible to adjust the visualization in several ways:
display_units
: converts the units in which the data is provided to kJ/mol, kcal/mol or eVfs
: resizes structuresFor all available options see our GitHub page.
For some adjustments, such as
fs
, the files in the *_2d
directory and the generated *_im_extent.txt
and *_xval.txt
files need to be deleted for the new settings to take effect.
! Energies in this file are computed at the wB97XD/6-311++G(d,p) level of theory.
! **********************
TemperatureList[K] 300.0 400.0 500.0 600.0 700.0 800.0 900.0 1000.0
PressureList[torr] 760.0
EnergyStepOverTemperature 0.2
ExcessEnergyOverTemperature 30
ModelEnergyLimit[kcal/mol] 400
CalculationMethod direct
ChemicalEigenvalueMax 0.2
Reactant w_1
MicroRateOutput micro.out
MicroEnerMax[kcal/mol] 200.
MicroEnerMin[kcal/mol] 0.
MicroEnerStep[kcal/mol] 1.
Model
EnergyRelaxation
Exponential
Factor[1/cm] 200
Power 0.85
ExponentCutoff 15
End
CollisionFrequency
LennardJones
Epsilons[1/cm] 7.95 310.39
Sigmas[angstrom] 2.715 6.0
Masses[amu] 4.0 102.0
End
######################
# WELLS
######################
Well w_1 ! 1022904475485954882032 ! [H][N]1[C]([H])([H])[C]([H])([H])[C]([H])([H])[C]1([H])[O][O]
Species
RRHO
Geometry[angstrom] 15
C -0.402471 0.009971 -1.391455
N 0.065970 -0.271497 -0.034190
C 1.454566 -0.063543 -0.007047
C 1.972216 -0.528766 -1.361313
C 0.707927 -0.611139 -2.249284
H -0.489430 1.087488 -1.585717
H -1.377532 -0.449607 -1.560672
H 1.933095 -0.483937 0.877622
H -0.430357 0.189743 0.716849
H 2.467293 -1.495429 -1.266419
H 2.706900 0.185858 -1.736319
H 0.469606 -1.653312 -2.468870
H 0.831280 -0.088197 -3.197903
O 1.821274 1.408749 0.061183
O 1.210544 2.001001 1.038427
Core RigidRotor
SymmetryFactor 0.5
End
Frequencies[1/cm] 39
59.6 99.1 210.0
292.9 400.9 556.2
652.6 685.6 714.3
826.4 850.6 925.4
933.3 949.4 1010.2
1082.4 1116.7 1172.6
1215.8 1243.1 1263.7
1280.1 1309.4 1329.8
1344.4 1354.7 1418.8
1474.7 1488.8 1509.9
1537.3 3016.1 3083.3
3094.5 3111.2 3125.8
3132.4 3149.3 3616.0
ElectronicLevels[1/cm] 1
0. 2
ZeroEnergy[kcal/mol] 0.0
End ! RRHO
End ! Species
! ****************************************
⋮
Traditional PES visualizations have limitations. We developed, based on the pyvis
Python package, a customizable and transparent method to look at very complex PESs.
pesviewer c7h7.inp
The result is an html
file that can be opened in a browser.
Despite the advanced visualization, it can be desired to find lowest energy pathways between species more rigorously than just by looking through the pathways manually. In this case pesviewer is able to find the lowest energy path between two species in the graph and generates a new pesviewer.inp
file with the MEP between this species.
KinBot is fully restartable and tries to avoid doing any of the calculations that are already done. However, before restart, care must be taken to kill the jobs properly. The killing must happen in a top-down order:
Kill the pes
executable based on the PID or using pkill -f pes
.1
Kill all kinbot
executions, e.g., based on the PID or simply with pkill -f kinbot
.1
Cancel all the quantum chemistry jobs in the queue related to the processes previously killed.
1 Note that this will kill all user processes that contain
pes
/kinbot
substring.
Some parameters cannot be changed upon restart - these are marked on the wiki.
KinBot allows a very detailed control can be used to study extremely complex systems. There are several issues to be considered.
So far the only way we have constrained the search has been through the "barrier_threshold"
keyword that allows discarding saddle points above a given energy. Other ways to constrain the search are often needed or desired. The various techniques to do so are:.
families
or skip_families
keywords one can provide a list of (un)desired templates.skip_chemids
keyword eliminates the search from a given well. This should only be used after a cursory initial search, and is useful when some chemical complication arises for a well that cannot be handled otherwise. Note that keep_chemids
forces the system to keep only the requested wells in the system.ringrange
to limit the size of cyclic transition states The
simultaneous_kinbot
keyword limits the number of kinbot
calculations running parallel, but does not limit the final number of wells.
Instead of hindered rotors, for low-temperature applications multi-conformer TST is suited better.
Møller et al. JPCA 2020, 124 , 2885-2896
multi_conf_tst
to turn on this featuremulti_conf_tst_temp
to set the relevant temperature (300 K is default)multi_conf_tst_boltz
is to set the Boltzmann population to cut off high-energy conformers (0.05 default)KinBot can be used in a Jupyter notebook or similar Python interpreters as well to look at molecular properties. For instance using SMILES:
from kinbot.stationary_pt import StationaryPoint
water = StationaryPoint('test', 0, 0, 'O') # create a water molecule
water.characterize() # do all sorts of characterization
water.bonds # query its bonds
…or using geometry:
import numpy as np
myatom = np.array(['C', 'C', 'C', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H'])
mygeom = np.array([[ 0.94485048, 0.0693553 , 0.01558594],
[ 0.45030974, 0.56172581, 1.36520909],
[ 0.94484208, 1.96854483, 1.656379 ],
[ 0.59026299, 0.71834224, -0.7914108 ],
[ 0.5795 , -0.94422848, -0.17728395],
[ 2.03879542, 0.04872775, -0.01633748],
[-0.64503931, 0.54825733, 1.38079305],
[ 0.7960085 , -0.11789557, 2.15186077],
[ 0.59025263, 2.67274435, 0.89708364],
[ 0.57948721, 2.30655403, 2.631211 ],
[ 2.03878679, 2.0031313 , 1.67215498]])
molec = StationaryPoint('test', 0, 0, geom=mygeom, atom=myatom)
molec.characterize()
print('bond matrices:', molec.bonds)
print('chemid: ', molec.chemid)
print('mass: ', molec.mass)
print('distance matrix: ', molec.dist)
print('cycles: ', molec.cycle)
print('dihedrals: ', molec.dihed)
print('conformer generating dihedrals: ', molec.conf_dihed)
print('unique atoms: ', molec.atom_uniq)
print('chirality labels: ', molec.chiral)
Theory developed by Stephen Klippenstein and others to treat barrierless reactions. Georgievskii a dn Klippenstein, J. Chem. Phys. 118, 5442–5455 (2003)
It is a sampling-based, variational method to count states using a series of dividing surfaces typically made up of intersecting spheres defined by pivot points.
Typically it requires the following workflow:
This is typically run on a restart - the user has to decide which barrierless reactions are important to be treated with VRC-TST. Barrierless reactions are typically homolytic scissions.
"vrc_tst_scan": {"781982302101340580261": ["781982302101340580261_hom_sci_4_5"]},
"vrc_tst_scan_points": [[1.5, 5.0, 0.2], [5.0, 20.0, 0.5]],
"vrc_tst_scan_method": "ub3lyp",
"vrc_tst_scan_basis": "6-31+G(d)",
"vrc_tst_sample_method": "caspt2(10,10)",
"vrc_tst_sample_basis": "vdz",
"vrc_tst_high_method": "caspt2(10,10)",
"vrc_tst_high_basis": "avtz",
Sella is an efficient and robust optimizer from our group. See https://github.com/zadorlab/sella.
see movie…
"pp_length": {"C": [0.5, 1.0, 1.5],
"X": [0.5, 1.0, 1.5]},
"pp_on_atom": [4, 12],
"rotdpy_dist": [[2.5, 5.0, 0.2], [5.0, 20.0, 0.5]],
Includes:
from rotd_py.fragment.nonlinear import Nonlinear
from rotd_py.fragment.linear import Linear
from rotd_py.system import Surface
from rotd_py.new_multi import Multi
from rotd_py.sample.multi_sample import MultiSample
from rotd_py.flux.fluxbase import FluxBase
from ase.atoms import Atoms
import numpy as np
def generate_grid(start, interval, factor, num_point):
"""Return a grid needed for the simulation of length equal to num_point
@param start:
@param interval:
@param factor:
@param num_point:
@return:
"""
i = 1
grid = [start]
for i in range(1, num_point):
start += interval
grid.append(start)
interval = interval * factor
return np.array(grid)
# temperature, energy grid and angular momentum grid
temperature = generate_grid(10, 10, 1.05, 80)
energy = generate_grid(0, 10, 1.05, 190)
angular_mom = generate_grid(0, 1, 1.1, 80)
# fragment info
# Coordinates in Angstrom
C3H3_0 = Nonlinear(
Atoms(['C', 'H', 'C', 'C', 'H', 'H'],
positions=[[-0.6152, 1.0113, 0.5708],
[-1.1147, 1.8324, 1.0343],
[-0.0392, 0.0644, 0.0363],
[0.6028, -0.991, -0.5594],
[0.4436, -1.2218, -1.6085],
[1.2852, -1.6203, 0.0043]])
)
C3H3_1 = Nonlinear(
Atoms(['C', 'H', 'C', 'C', 'H', 'H'],
positions=[[-0.6152, 1.0113, 0.5708],
[-1.1147, 1.8324, 1.0343],
[-0.0392, 0.0644, 0.0363],
[0.6028, -0.991, -0.5594],
[0.4436, -1.2218, -1.6085],
[1.2852, -1.6203, 0.0043]])
)
# Setting the dividing surfaces
# Pivot_points and distances are in Bohr
divid_surf = [
#Surface id: 0
#Fragment 0: pp oriented 0 (0.5 bohr) 3 (0.5 bohr)
#Fragment 1: pp oriented 0 (0.5 bohr) 3 (0.5 bohr)
#Distance: 2.5 Angstrom
Surface(pivotpoints={'0': np.array([[-1.5331328781538947, 1.5935479192172834, 1.187375410388408], [-0.8102874682552446, 2.2262142387503365, 0.9155184407876742], [1.5188123727445442, -1.571807981612902, -1.1808194683426916], [0.7594143350568129, -2.1734867048764177, -0.9332365020690908]]),
'1': np.array([[-1.5331328781538947, 1.5935479192172834, 1.187375410388408], [-0.8102874682552446, 2.2262142387503365, 0.9155184407876742], [1.5188123727445442, -1.571807981612902, -1.1808194683426916], [0.7594143350568129, -2.1734867048764177, -0.9332365020690908]])},
distances=np.array([[3.724317194435888, 3.724317194435888, 3.724317194435888, 3.724317194435888], [3.724317194435888, 3.724317194435888, 3.724317194435888, 3.724317194435888], [3.724317194435888, 3.724317194435888, 3.724317194435888, 3.724317194435888], [3.724317194435888, 3.724317194435888, 3.724317194435888, 3.724317194435888]])),
⋮
It is an open source project, we welcome contributions, bug reports, and suggestions.
Running KinBot involves the submission of a large number of quantum chemistry calculations. The total duration of a KinBot run can be long, depending on the requested calculations, the system size, the number of pathways found, and the available resources.
For the purpose of this workshop, we prepared a set of files from pre-run calculations. To get the material needed open this link for Windows/Mac or this link for Linux/Mac and extract it.
On Linux/Mac type the following command, from the directory where you download it, to extract the files.
tar -xvf kinbot_workshop.tgz
The file is 2 Gb, so it might take a while to download on slow connections. Once uncompressed, it uses 8 Gb of disk space.