calculate exit vector distance of each molecule

Some days ago, I participated a seminar about drug discovery.
All presentations were nice, and I interested in one of unique method to representation chemical space, named “exit vector plot”.
ref is following
Following Ramachandran: exit vector plots (EVP) as a tool to navigate chemical space covered by 3D bifunctional scaffolds. The case of cycloalkanes
http://pubs.rsc.org/en/content/articlelanding/2016/ra/c5ra19958a

The concept is, chemical space is represented as two vectors n1, n2 with 4 parameters phi1, phi2, theta( dihedral angle ), r.
It’s like a ramachandran plot that is used to plot structure of protein.

image

In this work, the author calculate exit vector about several ring systems and clustering them.
I think this concept is very useful because if chemist want to replace a ring system to another one, it’s difficult to search molecules that shows low structure similarity but has high exit vector similarity.

So, if we can search molecules use exit vector similarity, it will be new way to expand chemical space.
In the paper, the author clustering molecules using Euclidean distance of exit vector. Distance is almost dissimilarity.
Now let’s try to write exit vector calculator using RDKit!
In following code, I focused in 2nd-diamine set because it was easy to define two vectors(n1, n2).
First I prepare sample sdf that has energy minimized 3D structure.(there is not in the code)
Then write code!
RDKit has cool function to calculate distance and angle in rdMolTransform class.

from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
import numpy as np
from rdkit.Chem import Draw
sdf = Chem.SDMolSupplier( "3d_amine.sdf", removeHs=False )
mols = [ m for m in sdf ]
mol = mols[1]
mol.GetSubstructMatch( Chem.MolFromSmarts( "N[H]" ) )
atoms = [atom for atom in mol.GetAtoms()]

def getev( mol ):
    if mol.GetNumConformers() >= 1:
        matches = mol.GetSubstructMatches( Chem.MolFromSmarts( "N[H]" ) )
        conf = mol.GetConformer()
        theta = rdMolTransforms.GetDihedralDeg( conf,
                                                matches[0][1],
                                                matches[0][0],
                                                matches[1][0],
                                                matches[1][1]  )
        temp_phi1 = 180 - rdMolTransforms.GetAngleDeg(conf,
                                           matches[1][0],
                                           matches[0][0],
                                           matches[0][1]
                                          )
        temp_phi2 = 180 - rdMolTransforms.GetAngleDeg(conf,
                                           matches[0][0],
                                           matches[1][0],
                                           matches[1][1]
                                          )
        if temp_phi1 >= temp_phi2:
            phi1 = temp_phi1
            phi2 = temp_phi2
        else:
            phi1 = temp_phi2
            phi2 = temp_phi1

        r = rdMolTransforms.GetBondLength( conf, matches[0][0], matches[1][0] )
        return theata, phi1, phi2, r
    else:
        print( "No conformer!" )

def transform_cartegian( theta, phi1, phi2, r ):
    theta = np.deg2rad( theta )
    phi1 = np.deg2rad( phi1 )
    phi2 = np.deg2rad( phi2 )
    x = np.sin( theta ) * np.sin( phi1 ) * np.sin( phi2 ) *r
    y = np.sin( theta ) * np.sin( phi1 ) * np.cos( phi2 ) *r
    z = np.sin( theta ) * np.cos( phi1 ) *r
    t = np.cos( theta ) *r
    return x, y, z, t

def get_dist(v1,v2):
    v1 = np.asarray( v1 )
    v2 = np.asarray( v2 )
    delta =  v1 - v2
    d = np.linalg.norm( delta )
    return d

def calc_distance( mol1, mol2 ):
    theta1, phi11, phi21, r1 = getev( mol1 )
    theta2, phi12, phi22, r2 = getev( mol2 )
    cart1 = transform_cartegian( theta1, phi11, phi21, r1 )
    cart2 = transform_cartegian( theta2, phi12, phi22, r2 )
    d = get_dist( cart1, cart2 )
    return d

All done!
Then run code.

for i, mol in enumerate( mols ):
    d = calc_distance(mols[3],mol)
    print( 3, i, d )

I got following response.
idx1 idx2 distace
3 0 4.90798883048
3 1 1.48017700543
3 2 6.08814365316
3 3 0.0
3 4 3.93873942603
3 5 5.45068906229
3 6 1.40146560569
3 7 0.0619944778129
3 8 6.92182054937
3 9 6.22560061852
3 10 6.19913412456
3 11 5.75093874617
3 12 5.38531808804
3 13 1.85179121146
3 14 1.80235655185
idx1, 6, 7 are near to idx4.
Let see…

Draw.MolsToGridImage( [mols[3],mols[1], mols[6], mols[7]] )
Draw.MolsToGridImage( mols )

I got following images.
It’s seems works fine.

selected mols
all mols

I uploaded the code to github.
https://github.com/iwatobipen/rdkit_evp

広告

コメントを残す

以下に詳細を記入するか、アイコンをクリックしてログインしてください。

WordPress.com ロゴ

WordPress.com アカウントを使ってコメントしています。 ログアウト / 変更 )

Twitter 画像

Twitter アカウントを使ってコメントしています。 ログアウト / 変更 )

Facebook の写真

Facebook アカウントを使ってコメントしています。 ログアウト / 変更 )

Google+ フォト

Google+ アカウントを使ってコメントしています。 ログアウト / 変更 )

%s と連携中