Molecular Similarity

Sometimes we discuss molecular similarity.
I think the meaning of "similar" depends on the situation.
For example, if an aromatic pharmacophore is important, phenyl and pyridyl may be similar.
But if molecular charge is important, phenyl and pyridyl may be dissimilar.
So having several similarity metrics to choose from is useful.
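To make the phenyl vs. pyridyl point concrete, here is a minimal sketch. It assumes toluene and 3-methylpyridine as stand-ins for the two groups, and it assumes Gasteiger partial charges are a reasonable proxy for "charge"; both choices are mine for illustration, and the helper names are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors, rdPartialCharges


def aromatic_ring_count(mol):
    """Crude 'aromatic pharmacophore' view: number of aromatic rings."""
    return rdMolDescriptors.CalcNumAromaticRings(mol)


def most_negative_charge(mol):
    """Crude 'charge' view: most negative Gasteiger partial charge (an assumed proxy)."""
    rdPartialCharges.ComputeGasteigerCharges(mol)
    return min(atom.GetDoubleProp("_GasteigerCharge") for atom in mol.GetAtoms())


# Illustrative stand-ins for a phenyl and a pyridyl group.
phenyl = Chem.MolFromSmiles("Cc1ccccc1")   # toluene
pyridyl = Chem.MolFromSmiles("Cc1cccnc1")  # 3-methylpyridine

for name, m in (("phenyl", phenyl), ("pyridyl", pyridyl)):
    print(name, aromatic_ring_count(m), round(most_negative_charge(m), 3))
```

By the ring count the two look the same, while the partial charges differ because of the ring nitrogen. Which view matters depends on the question being asked.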
RDKit has an interesting fragment-based similarity method called “Fraggle”.
It’s easy to use.
Let’s write some code.

from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs
from rdkit.Chem.Fraggle import FraggleSim
# Define a Tanimoto similarity helper for convenience.
def calctc(mol1, mol2):
    fp1 = AllChem.GetMorganFingerprintAsBitVect(mol1, 2)
    fp2 = AllChem.GetMorganFingerprintAsBitVect(mol2, 2)
    return DataStructs.TanimotoSimilarity(fp1, fp2)
# Make molecules from SMILES.
mol = Chem.MolFromSmiles("c1ccccc1Cc1ccccc1")
mol2 = Chem.MolFromSmiles("c1ccccc1Nc1ccccc1")
# Calculate an ECFP4-like similarity (Morgan fingerprint, radius 2).
In [26]: calctc(mol, mol2)
Out[26]: 0.3333333333333333
# Only a C/N difference, but the similarity is low!

# Calculate the Fraggle similarity.
In [27]: FraggleSim.GetFraggleSimilarity(mol, mol2)
Out[27]: (1.0, '[*]c1ccccc1.[*]c1ccccc1')
# This is closer to my intuition.
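For comparing other pairs, the two calculations above can be wrapped into one helper. This is only a sketch: the function name `compare` is hypothetical, and I swapped in `rdFingerprintGenerator` for the Morgan fingerprint because newer RDKit releases mark `GetMorganFingerprintAsBitVect` as deprecated.

```python
from rdkit import Chem
from rdkit.Chem import DataStructs, rdFingerprintGenerator
from rdkit.Chem.Fraggle import FraggleSim

# Morgan generator with radius 2 (ECFP4-like), 2048 bits.
_morgan = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)


def compare(smiles_a, smiles_b):
    """Return (Morgan Tanimoto, Fraggle similarity, best Fraggle fragment)."""
    mol_a = Chem.MolFromSmiles(smiles_a)
    mol_b = Chem.MolFromSmiles(smiles_b)
    if mol_a is None or mol_b is None:
        raise ValueError("could not parse one of the SMILES strings")
    morgan = DataStructs.TanimotoSimilarity(
        _morgan.GetFingerprint(mol_a), _morgan.GetFingerprint(mol_b)
    )
    # GetFraggleSimilarity returns the score and the query fragment that gave it.
    fraggle, fragment = FraggleSim.GetFraggleSimilarity(mol_a, mol_b)
    return morgan, fraggle, fragment


morgan_sim, fraggle_sim, best_fragment = compare(
    "c1ccccc1Cc1ccccc1", "c1ccccc1Nc1ccccc1"
)
print(morgan_sim, fraggle_sim, best_fragment)
```

Note that Fraggle fragments the first molecule (the query) and scores it against the second, so swapping the arguments may give a different value.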

The Fraggle similarity method is a fuzzier calculation, but its results are acceptable to medicinal chemists.
I think the method is close to comparing molecular frameworks.
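To illustrate the framework analogy, here is a minimal sketch that reduces both molecules to a generic Murcko framework with RDKit's `MurckoScaffold` module. This is not how Fraggle works internally; it only shows that the two molecules share one framework once atom types and bond orders are ignored. The helper name `generic_framework` is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold


def generic_framework(smiles):
    """Return the generic Murcko framework (all atoms carbon, all bonds single) as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: " + smiles)
    scaffold = MurckoScaffold.GetScaffoldForMol(mol)
    generic = MurckoScaffold.MakeScaffoldGeneric(scaffold)
    return Chem.MolToSmiles(generic)


fw_c = generic_framework("c1ccccc1Cc1ccccc1")  # carbon linker
fw_n = generic_framework("c1ccccc1Nc1ccccc1")  # nitrogen linker
print(fw_c, fw_n, fw_c == fw_n)
```

Both molecules collapse to the same ring-linker-ring skeleton, which matches the intuition that they differ only in the linker atom.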

If you are interested in this function, you can find a more detailed PDF here.
