Enumerate molecules with CXSMILES #RDKit #cheminformatics #memo

Recent versions of RDKit support not only SMILES but also CXSMILES (ChemAxon extended SMILES). As the name suggests, CXSMILES can carry a lot of extra information on top of a SMILES string, although the syntax is a little difficult to understand. I found that it is useful for enumerating molecules ;)

For example, suppose I would like to enumerate molecules from a core and two R-groups given as SMILES. There are lots of ways to do that. MolZip is one useful way, but the attachment points have to be annotated before the molecules can be enumerated.
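To show what I mean by annotation, here is a minimal sketch with Chem.molzip. It assumes the attachment points are labelled with matching atom map numbers on dummy atoms. The fragments are only illustrative and are not taken from my notebook.

```python
from rdkit import Chem

# Both fragments carry a dummy atom with the same atom map number (:1).
# This label is the annotation molzip uses to decide where to join them.
core_frag = Chem.MolFromSmiles('c1ccc2[nH]ccc2c1[*:1]')  # indole with one labelled position
rgroup = Chem.MolFromSmiles('Cl[*:1]')

# molzip removes the paired dummy atoms and bonds their neighbours together
product = Chem.molzip(core_frag, rgroup)
print(Chem.MolToSmiles(product))
```

Each attachment position needs its own labelled fragment in this approach, which is why I looked at CXSMILES instead.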

Today I tried to enumerate molecules from CXSMILES with the molecule enumerator (rdMolEnumerator). More details about the enumerator are described in Greg’s great blog post.

OK, let’s start.

I used indole as the core, and Cl and a methoxy group as the R-groups.

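import rdkit
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem import rdMolEnumerator
# Show atom indices in the drawings; they are needed to write the CXSMILES later
IPythonConsole.drawOptions.addAtomIndices = True
IPythonConsole.drawOptions.baseFontSize = 1.0
IPythonConsole.drawOptions.padding = 0.001
# Indole core
coresmi = 'c1ccc2[nH]ccc2c1'
core = Chem.MolFromSmiles(coresmi)
# R-groups; '*' is a dummy atom that marks the attachment point
r1 = 'CO*'
r2 = 'Cl*'
# Put the R-groups and the core into one molecule as disconnected fragments
m = Chem.MolFromSmiles(f'{r1}.{r2}.{coresmi}')
m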

I want to attach the Cl atom at atom index 5 or 13, and the methoxy group at atom index 6 or 7.
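If you are not working in a notebook, the indices can also be checked in text form. This is a small sketch that relies on RDKit keeping the atom order of the input SMILES. The variable name dummy_indices is my own.

```python
from rdkit import Chem

coresmi = 'c1ccc2[nH]ccc2c1'
m = Chem.MolFromSmiles(f'CO*.Cl*.{coresmi}')

# Atom indices follow the order in which atoms appear in the SMILES string
for atom in m.GetAtoms():
    flag = 'aromatic' if atom.GetIsAromatic() else ''
    print(atom.GetIdx(), atom.GetSymbol(), flag)

# Dummy atoms have atomic number 0; these are the attachment points of the R-groups
dummy_indices = [a.GetIdx() for a in m.GetAtoms() if a.GetAtomicNum() == 0]
print(dummy_indices)
```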

At first I made a mol from CXSMILES with multicenter S-groups.

m2 = Chem.MolFromSmiles(f'{r1}.{r2}.{coresmi} |m:4:5.13,2:6.7|')
m2

m:4:5.13 means that atom 4 (the dummy atom of the Cl fragment) is connected to the aromatic carbon at index 5 or 13, and 2:6.7 means that atom 2 (the dummy atom of the methoxy fragment) is connected to the aromatic carbon at index 6 or 7. Multiple connections are separated by commas.
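Writing this block by hand is error-prone, so a small helper can build it. The function multicenter_block is hypothetical (my own, not part of RDKit) and simply reproduces the string format used above.

```python
def multicenter_block(attachments):
    """Build a CXSMILES multicenter block such as '|m:4:5.13,2:6.7|'.

    attachments maps a dummy-atom index to the list of candidate atom indices.
    """
    parts = []
    for dummy_idx, targets in attachments.items():
        # Candidate atoms for one dummy atom are joined with '.'
        parts.append(f"{dummy_idx}:{'.'.join(str(t) for t in targets)}")
    # Separate definitions are joined with ','
    return '|m:' + ','.join(parts) + '|'


ext = multicenter_block({4: [5, 13], 2: [6, 7]})
print(ext)
cxsmiles = f'CO*.Cl*.c1ccc2[nH]ccc2c1 {ext}'
print(cxsmiles)
```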

After that, I could get the enumerated molecules with the following command.

res = rdMolEnumerator.Enumerate(m2)
res

As you can see, the enumerated molecules have the OMe and Cl groups at the user-defined positions.
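To check the result without drawing, the enumerated molecules can be written out as SMILES. This is a sketch assuming that Enumerate returns a MolBundle. With two positions for each R-group I expect 2 x 2 = 4 products.

```python
from rdkit import Chem
from rdkit.Chem import rdMolEnumerator

m2 = Chem.MolFromSmiles('CO*.Cl*.c1ccc2[nH]ccc2c1 |m:4:5.13,2:6.7|')
res = rdMolEnumerator.Enumerate(m2)

# Collect the canonical SMILES of each member of the bundle
smiles = sorted({Chem.MolToSmiles(res.GetMol(i)) for i in range(len(res))})
print(len(res))
for smi in smiles:
    print(smi)
```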

In summary, the combination of RDKit and CXSMILES will be a powerful tool for cheminformatics.

I uploaded today’s notebook to my gist.


Published by iwatobipen

I'm a medicinal chemist at a mid-size pharmaceutical company. I love chemoinfo, coding, organic synthesis, and my family.
