I posted blog about compare exit vector distance of two molecules some days ago.
I used cartesian system to calculate distance of two molecules before. Today I tried to make polar plot using RDKit and matplotlib. ;-)
I used 3d amine dataset and made my old code.
At first I make scatter plot of similarity and exit vector distance.
%matplotlib inline import matplotlib.pyplot as plt from rdkit import Chem from rdkit.Chem import Draw from rdkit.Chem.Draw import IPythonConsole from rdkit.Chem import AllChem from rdkit.Chem import rdMolTransforms from rdkit.Chem import DataStructs import numpy as np from rdkit.Chem import Draw import seaborn as sns import pandas as pd import matplotlib.pyplot as plt sdf = Chem.SDMolSupplier( "3d_amine.sdf", removeHs=False ) mols = [ m for m in sdf ] fps = [ AllChem.GetMorganFingerprintAsBitVect( mol,2 ) for mol in mols ] mol = mols[1] mol.GetSubstructMatch( Chem.MolFromSmarts( "N[H]" ) ) atoms = [atom for atom in mol.GetAtoms()] def getev( mol ): if mol.GetNumConformers() >= 1: matches = mol.GetSubstructMatches( Chem.MolFromSmarts( "N[H]" ) ) conf = mol.GetConformer() theta = rdMolTransforms.GetDihedralDeg( conf, matches[0][1], matches[0][0], matches[1][0], matches[1][1] ) temp_phi1 = 180 - rdMolTransforms.GetAngleDeg(conf, matches[1][0], matches[0][0], matches[0][1] ) temp_phi2 = 180 - rdMolTransforms.GetAngleDeg(conf, matches[0][0], matches[1][0], matches[1][1] ) if temp_phi1 >= temp_phi2: phi1 = temp_phi1 phi2 = temp_phi2 else: phi1 = temp_phi2 phi2 = temp_phi1 r = rdMolTransforms.GetBondLength( conf, matches[0][0], matches[1][0] ) return theta, phi1, phi2, r else: print( "No conformer!" ) def transform_cartegian( theta, phi1, phi2, r ): theta = np.deg2rad( theta ) phi1 = np.deg2rad( phi1 ) phi2 = np.deg2rad( phi2 ) x = np.sin( theta ) * np.sin( phi1 ) * np.sin( phi2 ) *r y = np.sin( theta ) * np.sin( phi1 ) * np.cos( phi2 ) *r z = np.sin( theta ) * np.cos( phi1 ) *r t = np.cos( theta ) *r return x, y, z, t def get_dist(v1,v2): v1 = np.asarray( v1 ) v2 = np.asarray( v2 ) delta = v1 - v2 d = np.linalg.norm( delta ) return d def calc_distance( mol1, mol2 ): theta1, phi11, phi21, r1 = getev( mol1 ) theta2, phi12, phi22, r2 = getev( mol2 ) cart1 = transform_cartegian( theta1, phi11, phi21, r1 ) cart2 = transform_cartegian( theta2, phi12, phi22, r2 ) d = get_dist( cart1, cart2 ) return d df = pd.DataFrame(dataset, columns=['dist','sim']) g=sns.lmplot('dist','sim',df, fit_reg=False)
OK, next I made polar plot.
datasets = [ getev(mol) for mol in mols ] datadict = { 'theta': [data[0] for data in datasets ] , 'phi1': [data[1] for data in datasets ], 'phi2': [data[2] for data in datasets ], 'r': [data[3] for data in datasets ] } df = pd.DataFrame(datadict) df.head()
I got dataframe that has phi, r, theta.
idx phi1 phi2 r theta
0 95.144560 73.623772 3.146983 -72.921352
1 96.238121 68.160414 3.413222 154.397291
2 123.369913 79.976269 4.396864 69.708195
3 79.441975 79.441975 2.953801 180.000000
4 72.223406 70.292055 3.744892 -109.089956
Function of subplot can make multi plots, the three digit number indicates that number of row, number of column and index. So following code, I make three polar plot in one row and three columns.
‘polar = True’ makes polar plot.
polarplot = plt.subplot(131,polar=True) polarplot.scatter( df.theta, df.r,color='r' ) plt.legend(['theta'],bbox_to_anchor=(1.00, 0)) polarplot = plt.subplot(132,polar=True) polarplot.scatter( df.phi1, df.r, color='g' ) plt.legend(['phi1'],bbox_to_anchor=(1.00, 0)) polarplot = plt.subplot(133,polar=True) polarplot.scatter( df.phi2, df.r, color='b' ) plt.legend(['phi2'],bbox_to_anchor=(1.00, 0)) plt.show()
Some example of molecules that used in the code….
The plot is interesting for me!
I pushed the snippet to my repo.
https://github.com/iwatobipen/rdkit_evp/blob/master/samplescript/exit_vector_radialplot.ipynb