Molecular dynamics and Quantum Chemistry are important tools for CADD. I have interested in these topics and OpenMM and Psi4 are nice tool to handing MD and QM.
Today I tried to use psi4-openmm-interface which allows passing of molecular systems between each program.
I reviewed test script and found that the package pass the molecule which optimized with openmm to psi4 as xyz format. And also the package can conduct MD calculation very simply.
I used it, following code is almost same as original repo.
At first, I install the package you can find how to install it in following URL.
https://github.com/mzott/Psi4-OpenMM-Interface
Install psi4, openmm and set PYTHONPASS. It was simple. And I need to modify some code to import the package correctly.
- It uses PeriodicTable class of qcelemental, but the module can’t import _temp_symbol attribute from periodictable.py, so I added _temp_symbol line to qcelemental/periodic_table.py
# qcelemental/periodic_table.py class PeriodicTable: """Nuclear and mass data about chemical elements from NIST. Parameters ---------- None Attributes ---------- A : list of int Mass number, number of protons and neutrons, starting with 0 for dummies. Z : list of int Atomic number, number of protons, starting with 0 for dummies. E : list of str Element symbol from periodic table, starting with "X" for dummies. "Fe" capitalization. EA : list of str Nuclide symbol in E + A form, e.g., "Li6". List `EA` is a superset of `E`; that is, both "Li6" and "Li" present. For hydrogen, "D" and "T" also included. mass : list of :py:class:`decimal.Decimal` Atomic mass [u]. For nuclides (e.g., "Li6"), the reported mass. For stable elements (e.g., "Li"), the mass of the most abundant isotope ("Li7"). For unstable elements (e.g., "Pu"), the mass of the longest-lived isotope ("Pu244"). name : list of str Element name from periodic table, starting with "Dummy". "Iron" capitalization. """ def __init__(self): from . import data # Of length number of elements self.Z = data.nist_2011_atomic_weights["Z"] self.E = data.nist_2011_atomic_weights["E"] self.name = data.nist_2011_atomic_weights["name"] self._el2z = dict(zip(self.E, self.Z)) self._z2el = collections.OrderedDict(zip(self.Z, self.E)) self._element2el = dict(zip(self.name, self.E)) self._el2element = dict(zip(self.E, self.name)) # Of length number of isotopes self._EE = data.nist_2011_atomic_weights["_EE"] self.EA = data.nist_2011_atomic_weights["EA"] self.A = data.nist_2011_atomic_weights["A"] self.mass = data.nist_2011_atomic_weights["mass"] self._eliso2mass = dict(zip(self.EA, self.mass)) self._eliso2el = dict(zip(self.EA, self._EE)) self._eliso2a = dict(zip(self.EA, self.A)) self._el2a2mass = collections.defaultdict(dict) for EE, m, A in zip(self._EE, self.mass, self.A): self._el2a2mass[EE][A] = float(m) # following code was added self._temp_symbol = [ "X", "H", "HE", "LI", "BE", "B", "C", "N", "O", "F", "NE", "NA", "MG", "AL", "SI", "P", "S", "CL", "AR", "K", "CA", "SC", "TI", "V", "CR", "MN", "FE", "CO", "NI", "CU", "ZN", "GA", "GE", "AS", "SE", "BR", "KR", "RB", "SR", "Y", "ZR", "NB", "MO", "TC", "RU", "RH", "PD", "AG", "CD", "IN", "SN", "SB", "TE", "I", "XE", "CS", "BA", "LA", "CE", "PR", "ND", "PM", "SM", "EU", "GD", "TB", "DY", "HO", "ER", "TM", "YB", "LU", "HF", "TA", "W", "RE", "OS", "IR", "PT", "AU", "HG", "TL", "PB", "BI", "PO", "AT", "RN", "FR", "RA", "AC", "TH", "PA", "U", "NP", "PU", "AM", "CM", "BK", "CF", "ES", "FM", "MD", "NO", "LR", "RF", "DB", "SG", "BH", "HS", "MT", "DS", "RG", "UUB", "UUT", "UUQ", "UUP", "UUH", "UUS", "UUO" ] .....snip...
2. Line 81 of psiomm/molecule.py should be change below.
# from #z_vals.append(periodictable.el2z[Zxyz_list[0].upper()]) #To z_vals.append(periodictable._el2z[Zxyz_list[0].upper()])
3. gaff2.xml which is forcefield parameters is placed in openmm installed path openmm/app/data/. Because for my env, gaff2.xml file couldn’t find in the folder.
4. Got cov_radii.py from qcdb/qcdb and placed in psi4/driver/qcdb folder.
Now ready. Let’s write code.
Import some packages at first I used psikit for molecular geometry generation.
import numpy as np from psikit import Psikit from psiomm import molecule from psiomm import psi_omm as po from simtk.openmm.app import Simulation from simtk.openmm.app import ForceField from simtk.openmm.app import Modeller from simtk.openmm.app import PDBReporter from simtk.openmm import Platform from simtk.openmm import LangevinIntegrator from simtk.openmm import Vec3 from simtk.unit import picosecond from simtk.unit import kelvin from simtk.unit import kilocalorie_per_mole from simtk.unit import angstrom from rdkit import Chem from rdkit.Chem import AllChem from rdkit.Chem.Draw import IPythonConsole
Work flow
-
make molecule from xyz string => solute
-
generate bonds
-
generate atom types
-
generate charges
-
make topology
-
define forcefield
-
generate template with FF, topology and solute
-
make modeller
-
perform simulation to get trajectory
pk = Psikit() pk.read_from_smiles('CCOc1cccnc1') pk.rdkit_optimize()
Psikit generate xyz strings with charge and multiplicity information but psiomm can’t read the line so I remove them and passed the string to psiomm. Then generate some molecular properties.
# remove mutiplicity, formal charge xyzstring = '\n'.join(pk.mol2xyz().split('\n')[2:]) solute = molecule.Molecule.from_xyz_string(xyzstring) solute.generate_bonds() solute.generate_atom_types() solute.generate_charges() # check generated atoms and charges for i, at in enumerate(solute.atom_types): print(i, at, solute.charges[i]) 0 c3 -0.16240415191041357 1 c3 0.03679384783580364 2 os -0.263056064493707 ---snip--- 17 h4 0.07500144690962385
Following workflow is based on openmm. GAFF2 is implemented in AMBER and TIP3P is water model which is implemented in CHARMM.
topology = po.make_topology(solute) forcefield = ForceField('gaff2.xml', 'tip3p.xml') po.generateTemplate(forcefield, topology, solute) coords = [] for i in range(len(solute.xyz)): print(solute.xyz[i]) coords.append(Vec3(solute.xyz[i][0], solute.xyz[i][1], solute.xyz[i][2])) positions = coords * angstrom
Make modeller with topology and positions. I added 50 water molecules to the model. And defined the simulation system.
modeller = Modeller(topology, positions) modeller.addSolvent(forcefield, numAdded=50) omm_sys = forcefield.createSystem(modeller.topology) integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picosecond) simulation = Simulation(modeller.topology, omm_sys, integrator, platform=Platform.getPlatformByName('Reference'))
I used PDBReporter module to store the simulation. Simulation step set 400 and reporter object save data each 10 step. So I could get output pdb with 40 (400 /10 ) states.
simulation.context.setPositions(modeller.positions) simulation.reporters.append(PDBReporter('output.pdb', 10)) simulation.step(400) simulation.minimizeEnergy(tolerance=2*kilocalorie_per_mole)

It worked fine. following code didn’t use psi4 based calculation, to do it I need to modify input file because last data has not only target molecule but also 50 water molecules. I’ll check the package and psi4 more.
Today’s code can find from my gist/github repo.
https://github.com/iwatobipen/playground/blob/master/openmm-psikit.ipynb