Tips for MCS in RDKit

FindMCS is a useful function for me, because I sometimes want to extract the common substructure from a set of compounds.
But with a large set of compounds, it gives me boring results like an ethyl group and so on. That's no wonder.

The FindMCS function of RDKit has a neat way to deal with that. With the “threshold” option I can set the minimum fraction of molecules that must contain the MCS, so the MCS no longer has to be present in every molecule.
I found one tip: the result of FindMCS with this option depends on the order of the molecules.
See the following code.

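from rdkit import Chem
from rdkit.Chem import Draw  # needed for Draw.MolsToGridImage below
from rdkit.Chem import MCS  # older MCS module; later RDKit releases deprecate it in favour of rdkit.Chem.rdFMCS
from rdkit.Chem.Draw import IPythonConsole  # renders molecules inline in a notebook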

mol1 = Chem.MolFromSmiles("Cc1ccccc1")   # toluene
mol2 = Chem.MolFromSmiles("CCc1ccccc1")  # ethylbenzene
mol3 = Chem.MolFromSmiles("Oc1ccccc1")   # phenol
mol4 = Chem.MolFromSmiles("COc1ccccc1")  # anisole
Draw.MolsToGridImage([mol1, mol2, mol3, mol4])

[Image: grid drawing of mol1, mol2, mol3 and mol4]
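Before running the MCS search, it can help to count how many of these four molecules contain a few candidate substructures. The sketch below does that with plain substructure matching. The three SMARTS patterns and the helper name count_matches are my own choices for illustration, not something FindMCS uses internally.

```python
from rdkit import Chem

# The same four molecules as in the post.
smiles = ["Cc1ccccc1", "CCc1ccccc1", "Oc1ccccc1", "COc1ccccc1"]
mols = [Chem.MolFromSmiles(s) for s in smiles]


def count_matches(smarts, mols):
    """Return how many molecules contain the SMARTS pattern (hypothetical helper)."""
    query = Chem.MolFromSmarts(smarts)
    return sum(1 for m in mols if m.HasSubstructMatch(query))


# Hand-picked candidate patterns, for illustration only.
candidates = {
    "benzene ring": "c1ccccc1",
    "ring + carbon": "Cc1ccccc1",
    "ring + oxygen": "Oc1ccccc1",
}

counts = {name: count_matches(s, mols) for name, s in candidates.items()}
for name, n in counts.items():
    print(name, ":", n, "of", len(mols))
```

The bare ring is in all four molecules, while the two 7-atom patterns are each in exactly two of the four. With threshold=0.5 both 7-atom patterns therefore qualify and are the same size, which is a plausible reason why the input order could decide which one is reported.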

OK, now get the MCS.

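# Same molecules, same threshold, only the input order differs.
res = MCS.FindMCS([mol1, mol2, mol3, mol4], threshold=0.5)
res2 = MCS.FindMCS([mol4, mol3, mol2, mol1], threshold=0.5)
# Draw the MCS found for the original order.
Chem.MolFromSmarts(res.smarts)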

[Image: MCS drawn from res.smarts]
Next, the MCS for the reversed order.

Chem.MolFromSmarts(res2.smarts)

[Image: MCS drawn from res2.smarts]

A different order of molecules gave a different result. I will keep that in mind!!!
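To check this kind of order dependence without looking at pictures, one option is to print the MCS SMARTS for each order together with the number of input molecules that contain it. The sketch below uses rdkit.Chem.rdFMCS instead of the MCS module used above, so it is a swapped-in API and I am not claiming it shows the same order dependence. The helper name mcs_summary is hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

# The same four molecules as in the post.
smiles = ["Cc1ccccc1", "CCc1ccccc1", "Oc1ccccc1", "COc1ccccc1"]
mols = [Chem.MolFromSmiles(s) for s in smiles]


def mcs_summary(mols, threshold=0.5):
    """Run rdFMCS.FindMCS and return (SMARTS, number of inputs containing it)."""
    result = rdFMCS.FindMCS(mols, threshold=threshold)
    smarts = result.smartsString
    if not smarts:
        # No common substructure was reported.
        return smarts, 0
    query = Chem.MolFromSmarts(smarts)
    n_matches = sum(1 for m in mols if m.HasSubstructMatch(query))
    return smarts, n_matches


# Compare the original order with the reversed order.
results = {
    "original": mcs_summary(mols),
    "reversed": mcs_summary(mols[::-1]),
}

for label, (smarts, n_matches) in results.items():
    print(label, smarts, n_matches, "of", len(mols))
```

If the two SMARTS strings differ, the result depends on the input order for that RDKit version, and the match counts show how many molecules each answer really covers.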
