Recently I updated my RDKit environment from 2017.03 to 2017.09 using conda.
The new version of RDKit adds a cool module named rdRGroupDecomposition.
It lets us render R-groups as a pandas DataFrame.
I tried it on the cdk2.sdf dataset.
The code I wrote is below (run in a Jupyter notebook).
    from rdkit import Chem
    from rdkit.Chem import Draw, AllChem
    from rdkit.Chem import PandasTools
    from rdkit.Chem import rdBase
    from rdkit.Chem import RDConfig
    from rdkit.Chem import rdRGroupDecomposition
    from rdkit.Chem.Draw import IPythonConsole
    import os

    PandasTools.InstallPandasTools()

    # Locate the cdk2.sdf file that ships with the RDKit documentation.
    base = RDConfig.RDDocsDir
    datapath = os.path.join(base, "Book/data/cdk2.sdf")
    mols = [mol for mol in Chem.SDMolSupplier(datapath) if mol is not None]

    # Mol objects that carry 3D conformer information did not work well,
    # so I remove the conformer info.
    for m in mols:
        m.RemoveAllConformers()

    # Define the core for the R-group decomposition.
    core = Chem.MolFromSmiles('[nH]1cnc2cncnc21')

    tables = PandasTools.LoadSDF(datapath)
    rg = rdRGroupDecomposition.RGroupDecomposition(core)
    for mol in mols[:5]:
        rg.Add(mol)

    # Run the R-group decomposition.
    rg.Process()
Then I visualize the R-group decomposition result.
    import pandas as pd

    PandasTools.molRepresentation = "svg"
    modlf = PandasTools.LoadSDF(datapath)
    frame = pd.DataFrame(rg.GetRGroupsAsColumns())
    frame
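As a rough sketch of why the last step works: as far as I understand, GetRGroupsAsColumns() returns a dict that maps column labels such as 'Core' and 'R1' to equal-length lists with one entry per molecule, which is exactly the shape pd.DataFrame accepts. The plain-Python snippet below shows the same data turned into one record per molecule. The helper name columns_to_rows and the placeholder strings are my own illustration, not RDKit output; in the real result the values are Mol objects.

```python
def columns_to_rows(columns):
    """Turn {'Core': [...], 'R1': [...]} into one dict per molecule."""
    labels = list(columns)
    # Every column must describe the same number of molecules.
    lengths = {len(columns[label]) for label in labels}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    # zip(*) walks the columns in parallel, one molecule at a time.
    return [
        dict(zip(labels, values))
        for values in zip(*(columns[label] for label in labels))
    ]


# Hypothetical placeholder values, NOT real decomposition results.
example_columns = {
    "Core": ["core_0", "core_1"],
    "R1": ["r1_0", "r1_1"],
    "R2": ["r2_0", "r2_1"],
}
example_rows = columns_to_rows(example_columns)
```

RDKit also provides GetRGroupsAsRows(), which should give the per-molecule view directly, so a helper like this is only useful for seeing how the two layouts relate.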
The result is shown in the following image. ;-)
The new version of RDKit is a cool and powerful tool for chemoinformatics. I really respect the RDKit developers.