calculate exit vector distance of each molecule

Some days ago, I participated a seminar about drug discovery.
All presentations were nice, and I interested in one of unique method to representation chemical space, named “exit vector plot”.
ref is following
Following Ramachandran: exit vector plots (EVP) as a tool to navigate chemical space covered by 3D bifunctional scaffolds. The case of cycloalkanes
http://pubs.rsc.org/en/content/articlelanding/2016/ra/c5ra19958a

The concept is, chemical space is represented as two vectors n1, n2 with 4 parameters phi1, phi2, theta( dihedral angle ), r.
It’s like a ramachandran plot that is used to plot structure of protein.

In this work, the author calculate exit vector about several ring systems and clustering them.
I think this concept is very useful because if chemist want to replace a ring system to another one, it’s difficult to search molecules that shows low structure similarity but has high exit vector similarity.

So, if we can search molecules use exit vector similarity, it will be new way to expand chemical space.
In the paper, the author clustering molecules using Euclidean distance of exit vector. Distance is almost dissimilarity.
Now let’s try to write exit vector calculator using RDKit!
In following code, I focused in 2nd-diamine set because it was easy to define two vectors(n1, n2).
First I prepare sample sdf that has energy minimized 3D structure.(there is not in the code)
Then write code!
RDKit has cool function to calculate distance and angle in rdMolTransform class.

```from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import AllChem
from rdkit.Chem import rdMolTransforms
import numpy as np
from rdkit.Chem import Draw
sdf = Chem.SDMolSupplier( &quot;3d_amine.sdf&quot;, removeHs=False )
mols = [ m for m in sdf ]
mol = mols[1]
mol.GetSubstructMatch( Chem.MolFromSmarts( &quot;N[H]&quot; ) )
atoms = [atom for atom in mol.GetAtoms()]

def getev( mol ):
if mol.GetNumConformers() &amp;gt;= 1:
matches = mol.GetSubstructMatches( Chem.MolFromSmarts( &quot;N[H]&quot; ) )
conf = mol.GetConformer()
theta = rdMolTransforms.GetDihedralDeg( conf,
matches[0][1],
matches[0][0],
matches[1][0],
matches[1][1]  )
temp_phi1 = 180 - rdMolTransforms.GetAngleDeg(conf,
matches[1][0],
matches[0][0],
matches[0][1]
)
temp_phi2 = 180 - rdMolTransforms.GetAngleDeg(conf,
matches[0][0],
matches[1][0],
matches[1][1]
)
if temp_phi1 &amp;gt;= temp_phi2:
phi1 = temp_phi1
phi2 = temp_phi2
else:
phi1 = temp_phi2
phi2 = temp_phi1

r = rdMolTransforms.GetBondLength( conf, matches[0][0], matches[1][0] )
return theata, phi1, phi2, r
else:
print( &quot;No conformer!&quot; )

def transform_cartegian( theta, phi1, phi2, r ):
x = np.sin( theta ) * np.sin( phi1 ) * np.sin( phi2 ) *r
y = np.sin( theta ) * np.sin( phi1 ) * np.cos( phi2 ) *r
z = np.sin( theta ) * np.cos( phi1 ) *r
t = np.cos( theta ) *r
return x, y, z, t

def get_dist(v1,v2):
v1 = np.asarray( v1 )
v2 = np.asarray( v2 )
delta =  v1 - v2
d = np.linalg.norm( delta )
return d

def calc_distance( mol1, mol2 ):
theta1, phi11, phi21, r1 = getev( mol1 )
theta2, phi12, phi22, r2 = getev( mol2 )
cart1 = transform_cartegian( theta1, phi11, phi21, r1 )
cart2 = transform_cartegian( theta2, phi12, phi22, r2 )
d = get_dist( cart1, cart2 )
return d
```

All done!
Then run code.

```for i, mol in enumerate( mols ):
d = calc_distance(mols[3],mol)
print( 3, i, d )
```

I got following response.
idx1 idx2 distace
3 0 4.90798883048
3 1 1.48017700543
3 2 6.08814365316
3 3 0.0
3 4 3.93873942603
3 5 5.45068906229
3 6 1.40146560569
3 7 0.0619944778129
3 8 6.92182054937
3 9 6.22560061852
3 10 6.19913412456
3 11 5.75093874617
3 12 5.38531808804
3 13 1.85179121146
3 14 1.80235655185
idx1, 6, 7 are near to idx4.
Let see…

```Draw.MolsToGridImage( [mols[3],mols[1], mols[6], mols[7]] )
Draw.MolsToGridImage( mols )
```

I got following images.
It’s seems works fine.

I uploaded the code to github.
https://github.com/iwatobipen/rdkit_evp