Comparison of rdChemReactions and EditableMol #RDKit #chemoinformatics

In this year I moved from MedChem team to CompChem team. And now I need to learn SBDD. Today I struggled mol object that has 3D information.

I would like to replace hydrogen which attached aromatic carbon to some atoms. I thought it is easy if I use rdChemReactions method. But I found that it was not good approach. Because RunReactants method generates products but the method can not keep 3D information of reacted atoms.

It is very interesting and good information for me. Let see example code.

Following code is very simple example.

code example

At first, I tried to replace atom with rdChemReactions method. It worked well but the position of fluorine atom was 0, 0, 0. It indicates that the method can not keep 3D information. I think it is reasonable because some chemical reaction dramatically changes molecular conformation.

The second example used Editable mol.

This approach could keep 3D information, it means that the method do just replace atom!

Of course bond length of C-H and C-F is different but the approach is more suitable for atom scanning I think.

I always surprised that RDKit has many useful function for drug design.

Advertisements

Visualize Molecular Orbital with pymol and psikit #RDKit #psi4 #psikit #pymol

I posted how to visualize MO with VMD, the data was generated from psikit.

I used VMD because psi4 provides visualize function for cubefile and I could not find the same method on pymol. Pymol is familier for me than VMD. So I would like to do same thing on pymol. And today I could do it.

It is very easy! Let’s try it. At first, geometry optimize and generate cubefile of acetic acid and tetrazole.

import sys
sys.path.append('/Users/iwatobipen/develop/chemoenv/psikit/psikit')
from psikit import Psikit
from rdkit import Chem
pk = Psikit()
# acetic acid
pk.read_from_smiles("CC(=O)[O-]")
pk.optimize()
Chem.MolToMolFile(pk.mol, 'acetic_acid.mol')
pk.getMOview()

# tetrazole
pk.read_from_smiles("CC1=NN=N[N-]1")
pk.optimize()
pk.getMOview()

The getMOview() to generate MO cube files. The function generates 4 files named Psi_a/b_n_n_A.cube. n is number of alpha electrons and number of alpha electrons + 1. It means HOMO and LUMO.

After file generation, launch pymol and load mol file and cubefiles. For acetic acid, acetic_acid.mol, Psi_a_16_16-A.cube and Psi_b_16_16-A.cube files are loaded for HOMO drawing.

Then, draw isomesh and set color from pymol command line.

isomesh HOMO_A,  Psi_a_16_16-A, -0.02
isomesh HOMO_B,  Psi_b_16_16-A, 0.02
color blue, HOMO_A
color red, HOMO_B

Now I could get following image on pymol.

And I could get HOMO of tetrazole with same manner.

Pymol can draw ligand-protein complex easily. I think it is interesting for rendering ligand MO on protein-ligand complex and think about new drug design.

Visualize HOMO LUMO with psi4 #RDKit #psi4 #psikit

Now thin wrapper of psi4 named psikit can generate cube file which has frontier orbital information. After calling getMOview, I would like to check the orbital shape.

Psi4 provides vmd_cube.py script which generates cool view on VMD. To run the script on python3, it is needed to change line 332 from ‘for k,v in options.iteritems():’ to ‘for k,v in options.items():’.

After changed the code. Ready to visualize molecular orbital! Let’s check it.
I run the following code on jupyter notebook.

import os
import sys
import pandas as pd
import numpy as np
from rdkit import Chem
from rdkit.Chem.Draw import IPythonConsole
sys.path.append('/Users/iwatobipen/develop/chemoenv/psikit/psikit/')
from psikit import Psikit
pk = Psikit()
pk.read_from_smiles('COc1ccncc1')
pk.optimize()
>Optimizer: Optimization complete!
>-360.5913889506649

Then call getMOview() to generate MO cube files.

Next, run the vmd_cube.py. The script has many options. To check the option, vmd_cube.py -h gives help.

I run the code interactive mode. After run the code, I could see MO on VMD.

To run the code, reader need install VMD in your PC. Of course the script generates image of each MO.

I think it is interesting that visualize MO of protein-ligand complex because it will show interaction of protein and ligand. It is very useful for understanding of the interaction. Any comments or suggestions are greatly appreciated. ;)

Calculate atomic charges with psikit #RDKit #psi4

Recently I implemented new function in psikit for atomic charge calculation. Now user can get mulliken charges, RESP charges and Lowdin charges very easily. There is quick example below.

At first import libraries.

import os
import sys
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
import pandas as pd
import numpy as np
from collections import defaultdict
sys.path.append('/Users/iwatobipen/develop/chemoenv/psikit/psikit')
from psikit import Psikit

I used methanol and imidazole for demo. And psikit can generate 3D structure from SMILES and optimize the conformer with RDKit. Then optimize geometry with psi4.

First example is methanol

pk = Psikit()
methanol = 'CO'
imidazole = 'C1=CN=CN1'
pk.read_from_smiles(methanol)
pk.optimize()
# from literature 6-31G*
# C 0.21 0.04
# O 0.61 0.52
# H 0.13-0.16  0.04
# H 0.39 0.35
resp = pk.resp_charges
mulliken = pk.mulliken_charges
res = defaultdict(list)
for i, atom in enumerate(pk.mol.GetAtoms()):
    res['SYMBOL'].append(atom.GetSymbol())
    res['MULLIKEN'].append(mulliken[i])
    res['RESP'].append(resp[i])
    #print('{:.3f} {:.3f} {}'.format(mulliken[i], resp[i], atom.GetSymbol()))
methanol_df = pd.DataFrame(res)

Oxygen atom shows negative charge. It seems reasonable I think but the values are far from the reference values.

Second example is imidazole.

pk.read_from_smiles(imidazole)
pk.optimize()

resp = pk.resp_charges
mulliken = pk.mulliken_charges
# from literature 6-31G*
#     MULLIKEN RESP
# C 0.02 0.24 
# H 0.15 0.16 
# N 0.55 0.10
# C 0.21 0.06
# H 0.15 0.12
# C 0.03 0.05
# H 0.14 0.11
# N 0.42 0.42
# H 0.33 0.26

#Basis set 6-31G**
res = defaultdict(list)
for i, atom in enumerate(pk.mol.GetAtoms()):
    res['SYMBOL'].append(atom.GetSymbol())
    res['MULLIKEN'].append(mulliken[i])
    res['RESP'].append(resp[i])
    #print('{:.3f} {:.3f} {}'.format(mulliken[i], resp[i], atom.GetSymbol()))
imidazole_df = pd.DataFrame(res)

Two nitrogen atoms shows negative values in both Mulliken/RESP methods.

Reference data came from the URL. I searched mulliken charges of both molecules in google. I found some reports about that but all values are slightly different. I would like to know the accurate value of atomic charges of these molecules. And would like to know how to calculate spin multiplicity with RDKit. Any advices are greatly appreciated. ;)

Draw HOMO LUMO with psikit-2 #chemoinformatics

Yesterday I wrote a post about psikit function for HOMO LUMO drawing. And @fmkz__ gave me very suggestive comment.

He performed QM calculation about tetrazole and disclosed the its distribution of negative charge.

I had interested in his suggestion and I tried it. By using psikit, I calculated energy of acetic acid and 5-methyl-1H-tetrazole and got HOMO LUMO cube files from the results.

Results is below. At first HOMO and LUMO of acetic acid.

HOMO of acetic acid
LUMO of acetic acid

These results shows that carbonyl C=O has large MO and it means carbonyl is most reactive position of the acetic acid. And also LUMO around the alpha carbon is large. It indicates that the site is reactive with electrophile I think.

Next, let’s see result of tetrazole. Tetrazole is often used in medicinal chemistry as a bioisoster of carboxylic acid.

HOMO of tetrazole
LUMO of tetrazole

These results shows 3-N position has large MO compared to other nitrogen atoms. I think it indicates that 3-N position is more reactive than other nitrogen. And I found good example to explain the result in slide share. URL is below.

https://www.slideshare.net/ShrikantPharande/tetrazole-and-triazole-67637231
Page 21 of the slide shows example of 5-sub tetrazoles alkylation.
The page shows example of reaction between 5-Me-1H-tetrazole and trityl-Cl. And most of alkylation occurred at n-3 position.

Borrowed from slide share

It is very exciting for me because I could know that QM is very useful tool for reactivity prediction. The result motivates me to enhance of psikit. ;-)

Any suggestion and advices are highly appreciated.

Draw HOMO LUMO with psikit #RDKit #Psi4 #PyMol

Now I and @fmkz___ are developing a thin wrapper library for Psi4 and RDKit named psikit.
Today I added new function for viewing molecular orbital HOMO and LUMO with pymol. Psi4 has the function which can output cube format file of MO named ‘cubeprop’. So I try to implement the function in to psikit.
By using the library, you can get HOMO/LUMO cube file easily.
Example is below.

from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import IPythonConsole
from psikit import Psikit
smi = 'COc1cccnc1'
mol = Chem.MolFromSmiles(smi)
AllChem.Compute2DCoords(mol)
mol

Just call ‘getMOview’ after optimization. It took a little bit long time on my PC.

pk = Psikit()
pk.read_from_smiles(smi)
%time pk.optimize()
>Optimizer: Optimization complete!
>CPU times: user 1min 54s, sys: 3.97 s, total: 1min 58s
>Wall time: 31.5 s
>-360.58490537900167
# get MO view!
pk.getMOview()

After executing the method, psikit makes 5 files named target.mol, Psi_a_n_n-A.cube, Psi_b_n_n-A.cube, Psi_a_n+1_n+1-A.cube, Psi_b_n+1_n+1-A.cube. First file is mol file. Second and third is information of its HOMO and forth, fifth is information of LUMO.

Then load all files from pymol. Change HOMO/LUMO object show setting as dot. I could get following image.

Optimized molecule
Molecule and HOMO
Molecule and LUMO
Molecule and HOMO/LUMO

The function and pymol seem to work fine. But I want to compare data between this result and another QM’s result.
Current version of psikit is still under development and some bugs. I would like to fix and tune the code.

New functionality of psikit #Chemoinformatics #RDKit #Psi4

Happy new year! ‘Akemashite Omedetou Gozaimasu’ in Japanese!

I and @kzfm_ -san are developing a library which uses a rdkit & psi4 named psikit. You can find brief introduction the concept following URL.
http://blog.kzfmix.com/entry/1536978824

Today I added new function that can calculate RESP charge of give molecule. RESP charge means ‘Restraints for Deriving
Atomic Charges’. Details can read from this link.

OK, I would like to show how to call new function. It is easy to use but it needs long time for large molecule. I used acetic acid as example. To calculate RESP Charge, user need to run optimization.

############################################
# This code came from RESP branch, not master branch.
############################################

from psikit import Psikit
smi = 'CC(=O)O'
# load smiles and perform optimzation
pk = Psikit()
pk.read_from_smiles(smi)
%time pk.optimize()
>Optimizer: Optimization complete!
>CPU times: user 21.6 s, sys: 920 ms, total: 22.5 s
>Wall time: 6.23 s
>-227.82082530474275

Now I can get HOMO, LUMO and RESP Charge. RESP charges are set as atomic properties of pk.mol object. Let’s check before calculation and after calculation.

# Get Atoms from pk.mol object and check atom properties
atoms = pk.mol.GetAtoms()
print(list(atoms[0].GetPropNames()))
> []
# run the calculation
charges = pk.resp_charge
atoms = pk.mol.GetAtoms()
print(list(atoms[0].GetPropNames()))
> ['EP_C', 'RESP_C']  # EP_C means electro static potential, RESP_C means RESP charge.

Finally draw molecule with RESP charges.

from rdkit.Chem.Draw import MolDrawOptions
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG
view = rdMolDraw2D.MolDraw2DSVG(600,800)
op = view.drawOptions()
for idx, atom in enumerate(pk.mol.GetAtoms()):
    num = float(atom.GetProp('RESP_C'))
    op.atomLabels[idx] = "{}({:,.2f})".format(atom.GetSymbol(), num)
AllChem.Compute2DCoords(pk.mol)
view.DrawMolecule(pk.mol)
view.FinishDrawing()
svg = view.GetDrawingText().replace('svg:', '')
SVG(svg)
acetic acid with RESP charge 6-31G** basis

The results indicates Oxygen atom has the most negative charge and alpha carbon shows more negative than carbonyl carbon.
The library is now under development. We would like to release the package near the feature.

All code of the post can get from the URL below.

https://nbviewer.jupyter.org/github/Mishima-syk/psikit/blob/resp/psikit/example.ipynb