These days there are lots of Python libraries for chemoinformatics and machine learning. One of my favorites is RDKit. ;-)
This area is still active, and today I tried a new library named "ODDT", the Open Drug Discovery Toolkit.
Reference URL is
https://jcheminf.springeropen.com/articles/10.1186/s13321-015-0078-2.
ODDT is well documented at http://oddt.readthedocs.io/en/latest/index.html?highlight=InteractionFingerprint. ⭐️
ODDT implements shape and electronic similarities!! I had never come across an open source library that implements electronic similarity. The library also implements functions that can detect protein-ligand interactions.
So, I tried ODDT.
First, the code for calculating some similarities is below.
To calculate ElectroShape descriptors, just use the shape.electroshape method. The USR and USRCAT descriptors come from shape.usr and shape.usr_cat in the same way.
from oddt import toolkit
from oddt import shape
from oddt import fingerprints
from rdkit.Chem import Draw
mols = toolkit.readfile( 'sdf', 'cdk2.sdf' )
mols = [ m for m in mols ]
print(len( mols ))
[out]
47
e_shapes = [ shape.electroshape( mol ) for mol in mols ]
usrcats = [ shape.usr_cat( mol ) for mol in mols ]
usrs = [ shape.usr( mol ) for mol in mols ]
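To get a feeling for what a USR-type descriptor contains, here is a minimal pure-Python sketch of the general idea: pick four reference points (the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that one) and summarize the atom distances to each point with three moments, giving 12 numbers. The function names usr_descriptor_sketch and _three_moments are my own, and the exact moment definitions (I use mean, standard deviation, and the cube root of the third central moment) are an assumption; ODDT's shape.usr may normalize them differently, so do not expect identical values.

```python
import math


def _three_moments(distances):
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(distances)
    mean = sum(distances) / n
    second = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    return [mean, math.sqrt(second), math.copysign(abs(third) ** (1.0 / 3.0), third)]


def usr_descriptor_sketch(coords):
    """Return a 12-element USR-style shape descriptor for a list of (x, y, z) tuples."""
    n = len(coords)
    # Reference point 1: molecular centroid.
    ctd = tuple(sum(c[k] for c in coords) / n for k in range(3))
    d_ctd = [math.dist(c, ctd) for c in coords]
    # Reference points 2 and 3: atoms closest to and farthest from the centroid.
    cst = coords[d_ctd.index(min(d_ctd))]
    fct = coords[d_ctd.index(max(d_ctd))]
    # Reference point 4: atom farthest from reference point 3.
    d_fct = [math.dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]

    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor.extend(_three_moments([math.dist(c, ref) for c in coords]))
    return descriptor


# Toy coordinates (hypothetical, not taken from cdk2.sdf).
toy = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.5, 0.0), (0.3, 0.4, 3.0)]
print(len(usr_descriptor_sketch(toy)))
```

Because only distances are used, the descriptor does not change when the molecule is translated or rotated, so no alignment step is needed before comparing two molecules.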
To calculate similarity, just use the usr_similarity method, as follows.
for i in range( len( mols[ :5 ] ) ):
    for j in range( i ):
        e_sim = shape.usr_similarity( e_shapes[i], e_shapes[j] )
        usrcat_sim = shape.usr_similarity( usrcats[i], usrcats[j] )
        usr_sim = shape.usr_similarity( usrs[i], usrs[j])
        print( i, j, "e_shim", e_sim, 'usrcat_sim', usrcat_sim,'usr_sim',usr_sim )
[out]
1 0 e_shim 0.879372074943 usrcat_sim 0.742055515733 usr_sim 0.676152090576
2 0 e_shim 0.865690221755 usrcat_sim 0.428271350002 usr_sim 0.686898339111
2 1 e_shim 0.896725884564 usrcat_sim 0.481233989554 usr_sim 0.763231432529
3 0 e_shim 0.766813506629 usrcat_sim 0.609482600031 usr_sim 0.463058006246
3 1 e_shim 0.7349875959 usrcat_sim 0.548950403001 usr_sim 0.459194544856
3 2 e_shim 0.715411936912 usrcat_sim 0.360330544106 usr_sim 0.424537194619
4 0 e_shim 0.810683079155 usrcat_sim 0.62174869307 usr_sim 0.61705827303
4 1 e_shim 0.774077718141 usrcat_sim 0.635441642096 usr_sim 0.694498992613
4 2 e_shim 0.755174336047 usrcat_sim 0.394074936141 usr_sim 0.618174238781
4 3 e_shim 0.931446873697 usrcat_sim 0.780733001638 usr_sim 0.562721912484
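For reference, the score in the original USR method is an inverse scaled Manhattan distance between two descriptor vectors, so identical shapes score 1 and the score falls toward 0 as the descriptors diverge. Below is a minimal sketch of that formula. The function name usr_style_similarity is hypothetical, and I am assuming plain unweighted averaging; ODDT's usr_similarity may weight the USRCAT and ElectroShape components differently, so this will not necessarily reproduce the numbers above.

```python
def usr_style_similarity(desc_a, desc_b):
    """Inverse scaled Manhattan distance: 1 / (1 + mean(|a_i - b_i|))."""
    if len(desc_a) != len(desc_b):
        raise ValueError("descriptors must have the same length")
    n = len(desc_a)
    manhattan = sum(abs(a - b) for a, b in zip(desc_a, desc_b))
    return 1.0 / (1.0 + manhattan / n)


# Two hypothetical 12-element descriptors that differ by 1 in every component.
print(usr_style_similarity([0.0] * 12, [1.0] * 12))  # prints 0.5
```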
OK, next I checked protein-ligand contacts. To do that, I prepared a protein file and a ligand file from a PDB entry.
Then I read each file and performed the calculation.
from oddt import interactions
pdb1 = next(toolkit.readfile( 'pdb', '1atp_apo.pdb'))
pdb1.protein = True
ligand = next( toolkit.readfile('sdf', 'atp.sdf'))
proteinatoms, ligandatoms, strict=interactions.hbonds( pdb1, ligand )
proteinatoms['resname']
[out]
array(['GLU', 'GLU', 'GLU', 'GLU', 'HOH', 'HOH', 'ARG', 'VAL', 'SER', 'ALA', 'HOH', 'PHE', 'GLY', 'LYS', 'HOH', 'HOH', 'THR'], dtype='<U3')
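A quick way to summarize this result is to count how many hydrogen-bond contacts each residue type contributes. In the sketch below the residue names are copied by hand from the output above, so it runs without ODDT; in a live session I assume you could pass proteinatoms['resname'] to Counter directly instead.

```python
from collections import Counter

# Residue names copied from the hbonds output above.
resnames = ['GLU', 'GLU', 'GLU', 'GLU', 'HOH', 'HOH', 'ARG', 'VAL', 'SER',
            'ALA', 'HOH', 'PHE', 'GLY', 'LYS', 'HOH', 'HOH', 'THR']

contact_counts = Counter(resnames)

# List residue types from most to least frequent.
for name, count in contact_counts.most_common():
    print(name, count)
```

Here water (HOH) and GLU account for 9 of the 17 contacts, which is easy to miss when reading the raw array.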
ODDT can also calculate a protein-ligand interaction fingerprint.
IFP = fingerprints.InteractionFingerprint( ligand, pdb1)
print( IFP )
[out]
array([0, 0, 0, ..., 0, 0, 0], dtype=uint8)
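A fingerprint like this becomes useful when you compare two binding poses or two ligands. As a rough illustration, here is a pure-Python Tanimoto coefficient that treats every nonzero position as an "on" bit. The function name tanimoto_on_bits and the toy vectors are hypothetical, and ignoring the counts is my own simplification; ODDT's own comparison functions may handle counts differently.

```python
def tanimoto_on_bits(fp_a, fp_b):
    """Tanimoto coefficient over the positions that are nonzero in each fingerprint."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    on_a = {i for i, value in enumerate(fp_a) if value}
    on_b = {i for i, value in enumerate(fp_b) if value}
    union = on_a | on_b
    if not union:
        # Both fingerprints are empty; report zero similarity by convention.
        return 0.0
    return len(on_a & on_b) / len(union)


# Toy fingerprints (hypothetical, much shorter than a real interaction fingerprint).
pose_1 = [1, 0, 2, 0, 0, 1]
pose_2 = [1, 1, 0, 0, 0, 1]
print(tanimoto_on_bits(pose_1, pose_2))  # prints 0.5
```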
I think ODDT is a very nice toolkit for chemoinformatics.
I uploaded my code to my GitHub repo.
This might be a very naïve question, but at pdb1 = next(toolkit.readfile('pdb', '1atp_apo.pdb')) I'm getting an error that says "AtomValenceException: Explicit valence for atom # 1246 O, 3, is greater than permitted", and it also says "# Set metal coordination (zero order) bond orders to single to prevent adding Hs". Could you please guide me on how to fix this error?
I checked the message, and I think you haven't installed openbabel. Could you please install openbabel with conda?
$ conda install -c openbabel openbabel
Then try again. I think it will work.
Can’t believe I missed something this simple. Thanks a lot!!
You are welcome ;)