Call psi4 from web app! #Psi4 #QuantumChemistry

Psi4 is a python package for quantum chemistry. I like it. Native psi4 has some difficulties for preparing input file. So psikit is useful I think. And I think webmo is useful package for calling psi4. I’ve always been interested webmo. So I tried to use it today.
Basic function of webmo is provided free licence.

At first I registered webmo for getting licence code. https://www.webmo.net/index.html

After registration, I could get webmo download link, licence number, and password.

Second, I installed several packages with apt command.

apt install apache2 apache2-suexec-custom libcgi-pm-perl

Then create directory structure.

$ cd ~
$ mkdir public_html
$ chmod 755 public_html
$ cd public_html
$ mkdir cgi-bin
$ chmod 755 cgi-bin
$ exit
Start apache
# a2enmod userdir
# a2enmod suexec
# a2enmod cgid
# service apache2 start

Then eddit /etc/apache2/mods-available/userdir.conf, to enable cgi script.


        UserDir public_html
        UserDir disabled root

        
                AllowOverride FileInfo AuthConfig Limit Indexes
                Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec ExecCGI #ExecCGI added
                AddHAndler cgi-script .cgi #Added
                Require method GET POST OPTIONS
        


Then restart apache ;)

# service apache2 restart

Then install webmo with perl script.

$ tar xzf WebMO.9.x.xxx.tar.gz
$ cd WebMO.install
$ perl setup.pl

Now ready, after restarting apache, I could access local webmo HP via web browser. Default setting is username is admin, password is blank.

After login as admin, system asked to change admin password. So I changed and the create user for calling Psi4.

sample URL is below
http://localhost/~yourusername/cgi-bin/webmo/login.cgi

You can get more details about installation is following URL.
https://www.webmo.net/support/installation.html

Then from admin top page, I set psi4 as quantum chemistry engine. Click interface tab, change psi4 from disable to enable. Then click edit button and set path for psi4 execution.

Tips, If reader installed psi4 via conda, the path schould be ‘/home/***/anaconda/envs/yourenv’. In the case, psi4 path is /home/***/anaconda/envs/yourenv/bin/psi4. bin/psi4 should not be included setting path!!!

After completing the configuration, I made user for QM calculation and login the system as a user.

Making input file is very easy, import file or draw molecule with implemented molecular editor. I used editor for the test.

After drawing the molecule, minimize energy and symmetrize(if possible) is done from selecting upper menu bar. It’s very easy isn’t it?

Then set job options which defines calculation method, basis function, reference and several QM related options. Also the setting can do from GUI.

After setting the options, input data can check from preview tab.

All setting is done, click submit job to run the calculation.

Following screenshot shows water as an example because benzoic acid required looooooooooooooong calculation time in my hobby PC.

Result can see jobmanager. I calculated water molecule vibration. So I could see calculated IR values and spectra.

The spectrum is not so accurate, the reason is that I used low level basis set I think.

Of course psi4 raw output file can download from webserver.

With free charge licence, it is not able to draw MO. But with commercial licence, rendering MO of molecule is available.

WebMO is very useful tool for education, and wet researchers. Because it is easy to use and rich GUI.

If reader who has interested in WebMO, try it! ;)

Calculate solvent effect in Psi4 #psi4 #quantumchemistry

Recently I use not only chemoinformatics tools but also quantum chemistry tool, my favorite is Psi4. Psi4 has many options and plug-ins for quantum calculation. Most setting of calculation is vacuum, but it actually true. So considering the solvent around the molecules is important.

Can psi4 perform calculation with solvent effect?

Yes!

PCMSolver is plugin for PCM calculation. It can be installed via conda.

$ conda install -c psi4 pcmsolver

PCM means ‘polarizable continuum model’. PCM describes solute-solvent interactions only by means of electrostatics and polarization between the solute molecule and solvent. The solvent is modeled as a dielectric continuum with a given permittivity ε.

To use PCMsolver in psi4, user need to make specific input file for the calculation which should has kind of solvent, basis function, CPCM or IEFPCM, etc.

I made to input file from toulene.mol file which is retrieved from ChEMBL.

input files are below. PCMsolver can not call from python interpreter so I need to make input.dat.

# toluene in water
#! pcm

molecule mol {

0 1
C -2.0705902285428452 0.09342117730391691 0.21438842743524625
C 0.06881622580040792 1.2107631870687914 0.004377262725162135
C -0.043709196272760036 -1.211640074147377 -0.026831077837745607
C -1.3202142335884217 1.271231179761799 0.15648583022799556
C 0.714364020910022 -0.03184758851751915 -0.10370200526939809
C 2.2076605753403253 -0.0996279156897954 -0.22684462930707974
C -1.4325455046382927 -1.1469904802166477 0.1253309895373383
H -3.1452716192643337 0.14180771794289201 0.3335057008466673
H 0.6396267675023236 2.1305648085368674 -0.0295934787763336
H 0.4394292831589224 -2.1792111515476216 -0.08511724686650844
H -1.8144108252945523 2.2311166905139106 0.23329105314598383
H 2.59680405039162 0.7889905119305699 -0.7681174009541791
H 2.5134086150024317 -1.0062726983460601 -0.7912601501934461
H 2.660308829389096 -0.1337122631434785 0.78606242092797
H -2.013676759894258 -2.058593101450173 0.17802430435839692
no_reorient
no_com
units angstrom

}

set {
basis 6-31G**
scf_type pk
pcm true
pcm_scf_type total
}

pcm = {
Units = Angstrom
Medium {
SolverType = IEFPCM
Solvent = Water
}

Cavity {
RadiiSet = UFF
Type = GePol
Scaling = false
Area = 0.3
Mode = Implicit
}
}

energy_pte, wfn = energy(‘scf’, return_wfn=True)

# toluene in benzene
#! pcm

molecule mol {

0 1
C -2.0705902285428452 0.09342117730391691 0.21438842743524625
C 0.06881622580040792 1.2107631870687914 0.004377262725162135
C -0.043709196272760036 -1.211640074147377 -0.026831077837745607
C -1.3202142335884217 1.271231179761799 0.15648583022799556
C 0.714364020910022 -0.03184758851751915 -0.10370200526939809
C 2.2076605753403253 -0.0996279156897954 -0.22684462930707974
C -1.4325455046382927 -1.1469904802166477 0.1253309895373383
H -3.1452716192643337 0.14180771794289201 0.3335057008466673
H 0.6396267675023236 2.1305648085368674 -0.0295934787763336
H 0.4394292831589224 -2.1792111515476216 -0.08511724686650844
H -1.8144108252945523 2.2311166905139106 0.23329105314598383
H 2.59680405039162 0.7889905119305699 -0.7681174009541791
H 2.5134086150024317 -1.0062726983460601 -0.7912601501934461
H 2.660308829389096 -0.1337122631434785 0.78606242092797
H -2.013676759894258 -2.058593101450173 0.17802430435839692
no_reorient
no_com
units angstrom

}

set {
basis 6-31G**
scf_type pk
pcm true
pcm_scf_type total
}

pcm = {
Units = Angstrom
Medium {
SolverType = IEFPCM
Solvent = Benzene
}

Cavity {
RadiiSet = UFF
Type = GePol
Scaling = false
Area = 0.3
Mode = Implicit
}
}

energy_pte, wfn = energy(‘scf’, return_wfn=True)

PCMsolver can has many kinds of solvents listed below.

    1:'Water',
    2:'Propylene Carbonate',
    3:'Dimethylsulfoxide',
    4:'Nitromethane',
    5:'Acetonitrile',
    6:'Methanol',
    7:'Ethanol',
    8:'Acetone',
    9:'1,2-Dichloroethane',
    10:'Methylenechloride',
    11:'Tetrahydrofurane',
    12:'Aniline',
    13:'Chlorobenzene',
    14:'Chloroform',
    15:'Toluene',
    16:'1,4-Dioxane',
    17:'Benzene',
    18:'Carbon Tetrachloride',
    19:'Cyclohexane',
    20:'N-heptane',
    21:'Explicit' 

After making the input files, following step is as same as default psi4 calculation.

$ psi4 input.dat -o output.dat

After waiting few seconds, I could get two output results. Picked up line 355-363.

## toulene in water
@RHF Final Energy: -269.75544369636918

=> Energetics Energetics <=

Nuclear Repulsion Energy = 268.6733324038011688
One-Electron Energy = -895.7789138809855558
Two-Electron Energy = 357.3548941675961714
PCM Polarization Energy = -0.0019460266422257
Total Energy = -269.7526333362304172

Computation Completed

There is difference in PCM Polarization Energy between in water and in toluene, in water it is -0.0052558561183017A.U.(-3.298kcal/mol) and in tolunene, it is -0.0019460266422257A.U.(-1.22kcal/mol)
*1 a.u. =627.50 kcal/mol

I would like to test some molecules. Are there freely available experimental data set? I’ll try to find them.

Added Notebook Example for psikit #psi4 #RDKit

Some days ago. I updated psikit, added new function, SAPT.

The method can calculate inter-intra molecular interaction with psi4. Psikit can call the method very easily.

And I added sample notebook in the repository. Following codes are very simple example of SATP for water dimer and FSAPT for phenol dimer.

SAPT for water dimer.



from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from psikit import Sapt

w1 = Chem.MolFromMolFile('water1.mol', removeHs=False)
w2 = Chem.MolFromMolFile('water2.mol', removeHs=False)
Draw.MolsToGridImage([w1,w2])
sapt = Sapt()
sapt.monomer1_from_molfile('water1.mol')
sapt.monomer2_from_molfile('water2.mol')
sapt.make_dimer()
res = sapt.run_sapt()
---stdout is below
Initializing SAPT object...

RHF for monomer A finished in 0.55 seconds.
RHF for monomer B finished in 0.52 seconds.
Building ERI tensor...
...built ERI tensor in 3.180 seconds.
Size of the ERI tensor is 0.36 GB, 82 basis functions.

...finished initializing SAPT object in  4.49 seconds.

Starting electrostatics...
...electrostatics took a total of  0.17 seconds.

Starting exchange...
...exchange took a total of  0.63 seconds.

Starting dispersion...
...dispersion took a total of  8.92 seconds.

Starting induction...
Ind20,r (AB)           -1.37852223 mH       -0.86503511 kcal/mol
Exch-Ind20,r (AB)       0.88580457 mH        0.55585034 kcal/mol
...induction took a total of  15.90 seconds.

SAPT0 Results
----------------------------------------------------------------------
Exch10 (S^2)             10.53844851 mH        6.61297129 kcal/mol
Elst10                  -13.02830646 mH       -8.17537956 kcal/mol
Disp20                   -3.42996225 mH       -2.15233218 kcal/mol
Exch-Disp20               0.61399531 mH        0.38528758 kcal/mol
Ind20,r                  -4.33068313 mH       -2.71754264 kcal/mol
Exch-Ind20,r              2.30125737 mH        1.44405971 kcal/mol
----------------------------------------------------------------------
Total SAPT0              -7.33525065 mH       -4.60293580 kcal/mol

F-SAPT for phenol dimer

from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import Draw
from psikit import Sapt

p1 = Chem.MolFromMolFile('phenol1.mol', removeHs=False)
p2 = Chem.MolFromMolFile('phenol2.mol', removeHs=False)
Draw.MolsToGridImage([p1,p2])
sapt = Sapt()
sapt.monomer1_from_molfile('phenol1.mol')
sapt.monomer2_from_molfile('phenol2.mol')
sapt.make_dimer()
res = sapt.run_fisapt()

--fA.dat and fB.dat files are generated in fsapt folder the  call fsapt.py from terminal, it generates FSAPT results which is shown below.
   => Full Analysis  Reduced Analysis  F-ISAPT: Links 50-50  Full Analysis  Reduced Analysis <=

Frag1     Frag2         Elst     Exch    IndAB    IndBA     Disp    Total 
c1ccccc1_0 c1ccccc1_0    0.686    2.197    0.007   -0.208   -2.403    0.278 
c1ccccc1_0 O_0         -2.751    0.733   -0.147   -0.227   -0.675   -3.067 
O_0       c1ccccc1_0    1.392    0.720    0.222   -0.347   -0.793    1.194 
O_0       O_0         -8.421    6.218   -0.584   -1.512   -1.250   -5.549 
c1ccccc1_0 All         -2.065    2.930   -0.140   -0.435   -3.078   -2.789 
O_0       All         -7.030    6.938   -0.362   -1.859   -2.043   -4.356 
All       c1ccccc1_0    2.078    2.917    0.229   -0.556   -3.196    1.472 
All       O_0        -11.173    6.951   -0.731   -1.739   -1.925   -8.617 
All       All         -9.095    9.868   -0.502   -2.295   -5.121   -7.145 

Psikit can use SAPT, FSAPT efficiently I think. There is room for improvement for FSAPT fragmentation method. Any comments and advice will be greatly appreciated.

URL is below.

GITHUB
https://github.com/Mishima-syk/psikit
FSAPT
https://nbviewer.jupyter.org/github/Mishima-syk/psikit/blob/master/examples/example_sapt/fsapt_ex.ipynb
SAPT
https://nbviewer.jupyter.org/github/Mishima-syk/psikit/blob/master/examples/example_sapt/sapt_ex.ipynb

Comparison of rdChemReactions and EditableMol #RDKit #chemoinformatics

In this year I moved from MedChem team to CompChem team. And now I need to learn SBDD. Today I struggled mol object that has 3D information.

I would like to replace hydrogen which attached aromatic carbon to some atoms. I thought it is easy if I use rdChemReactions method. But I found that it was not good approach. Because RunReactants method generates products but the method can not keep 3D information of reacted atoms.

It is very interesting and good information for me. Let see example code.

Following code is very simple example.

code example

At first, I tried to replace atom with rdChemReactions method. It worked well but the position of fluorine atom was 0, 0, 0. It indicates that the method can not keep 3D information. I think it is reasonable because some chemical reaction dramatically changes molecular conformation.

The second example used Editable mol.

This approach could keep 3D information, it means that the method do just replace atom!

Of course bond length of C-H and C-F is different but the approach is more suitable for atom scanning I think.

I always surprised that RDKit has many useful function for drug design.

Visualize Molecular Orbital with pymol and psikit #RDKit #psi4 #psikit #pymol

I posted how to visualize MO with VMD, the data was generated from psikit.

I used VMD because psi4 provides visualize function for cubefile and I could not find the same method on pymol. Pymol is familier for me than VMD. So I would like to do same thing on pymol. And today I could do it.

It is very easy! Let’s try it. At first, geometry optimize and generate cubefile of acetic acid and tetrazole.

import sys
sys.path.append('/Users/iwatobipen/develop/chemoenv/psikit/psikit')
from psikit import Psikit
from rdkit import Chem
pk = Psikit()
# acetic acid
pk.read_from_smiles("CC(=O)[O-]")
pk.optimize()
Chem.MolToMolFile(pk.mol, 'acetic_acid.mol')
pk.getMOview()

# tetrazole
pk.read_from_smiles("CC1=NN=N[N-]1")
pk.optimize()
pk.getMOview()

The getMOview() to generate MO cube files. The function generates 4 files named Psi_a/b_n_n_A.cube. n is number of alpha electrons and number of alpha electrons + 1. It means HOMO and LUMO.

After file generation, launch pymol and load mol file and cubefiles. For acetic acid, acetic_acid.mol, Psi_a_16_16-A.cube and Psi_b_16_16-A.cube files are loaded for HOMO drawing.

Then, draw isomesh and set color from pymol command line.

isomesh HOMO_A,  Psi_a_16_16-A, -0.02
isomesh HOMO_B,  Psi_b_16_16-A, 0.02
color blue, HOMO_A
color red, HOMO_B

Now I could get following image on pymol.

And I could get HOMO of tetrazole with same manner.

Pymol can draw ligand-protein complex easily. I think it is interesting for rendering ligand MO on protein-ligand complex and think about new drug design.

Visualize HOMO LUMO with psi4 #RDKit #psi4 #psikit

Now thin wrapper of psi4 named psikit can generate cube file which has frontier orbital information. After calling getMOview, I would like to check the orbital shape.

Psi4 provides vmd_cube.py script which generates cool view on VMD. To run the script on python3, it is needed to change line 332 from ‘for k,v in options.iteritems():’ to ‘for k,v in options.items():’.

After changed the code. Ready to visualize molecular orbital! Let’s check it.
I run the following code on jupyter notebook.

import os
import sys
import pandas as pd
import numpy as np
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
sys.path.append('/Users/iwatobipen/develop/chemoenv/psikit/psikit/')
from psikit import Psikit
pk = Psikit()
pk.read_from_smiles('COc1ccncc1')
pk.optimize()
>Optimizer: Optimization complete!
>-360.5913889506649

Then call getMOview() to generate MO cube files.

Next, run the vmd_cube.py. The script has many options. To check the option, vmd_cube.py -h gives help.

I run the code interactive mode. After run the code, I could see MO on VMD.

To run the code, reader need install VMD in your PC. Of course the script generates image of each MO.

I think it is interesting that visualize MO of protein-ligand complex because it will show interaction of protein and ligand. It is very useful for understanding of the interaction. Any comments or suggestions are greatly appreciated. ;)

Calculate atomic charges with psikit #RDKit #psi4

Recently I implemented new function in psikit for atomic charge calculation. Now user can get mulliken charges, RESP charges and Lowdin charges very easily. There is quick example below.

At first import libraries.

import os
import sys
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
import pandas as pd
import numpy as np
from collections import defaultdict
sys.path.append('/Users/iwatobipen/develop/chemoenv/psikit/psikit')
from psikit import Psikit

I used methanol and imidazole for demo. And psikit can generate 3D structure from SMILES and optimize the conformer with RDKit. Then optimize geometry with psi4.

First example is methanol

pk = Psikit()
methanol = 'CO'
imidazole = 'C1=CN=CN1'
pk.read_from_smiles(methanol)
pk.optimize()
# from literature 6-31G*
# C 0.21 0.04
# O 0.61 0.52
# H 0.13-0.16  0.04
# H 0.39 0.35
resp = pk.resp_charges
mulliken = pk.mulliken_charges
res = defaultdict(list)
for i, atom in enumerate(pk.mol.GetAtoms()):
    res['SYMBOL'].append(atom.GetSymbol())
    res['MULLIKEN'].append(mulliken[i])
    res['RESP'].append(resp[i])
    #print('{:.3f} {:.3f} {}'.format(mulliken[i], resp[i], atom.GetSymbol()))
methanol_df = pd.DataFrame(res)

Oxygen atom shows negative charge. It seems reasonable I think but the values are far from the reference values.

Second example is imidazole.

pk.read_from_smiles(imidazole)
pk.optimize()

resp = pk.resp_charges
mulliken = pk.mulliken_charges
# from literature 6-31G*
#     MULLIKEN RESP
# C 0.02 0.24 
# H 0.15 0.16 
# N 0.55 0.10
# C 0.21 0.06
# H 0.15 0.12
# C 0.03 0.05
# H 0.14 0.11
# N 0.42 0.42
# H 0.33 0.26

#Basis set 6-31G**
res = defaultdict(list)
for i, atom in enumerate(pk.mol.GetAtoms()):
    res['SYMBOL'].append(atom.GetSymbol())
    res['MULLIKEN'].append(mulliken[i])
    res['RESP'].append(resp[i])
    #print('{:.3f} {:.3f} {}'.format(mulliken[i], resp[i], atom.GetSymbol()))
imidazole_df = pd.DataFrame(res)

Two nitrogen atoms shows negative values in both Mulliken/RESP methods.

Reference data came from the URL. I searched mulliken charges of both molecules in google. I found some reports about that but all values are slightly different. I would like to know the accurate value of atomic charges of these molecules. And would like to know how to calculate spin multiplicity with RDKit. Any advices are greatly appreciated. ;)